This paper examines system combination for syllable-confusion-network (SCN)-based Chinese spoken term detection (STD). System combination for STD usually improves accuracy, but at the cost of a larger index or a more complicated index structure. In scenarios where index size and search speed are critical, a single compact index is highly desirable. We therefore explore methods for efficiently combining a word-based system and a syllable-based system while keeping the index compact. First, a composite SCN is generated using two approaches: lattice combination and confusion network combination. Then a simple, compact index is constructed from this composite SCN by merging information that is redundant across the two systems. Experimental results on a 60-hour corpus show a relative accuracy improvement of 16.20% over the baseline syllable-based system. Moreover, the proposed method reduces the index size by 22.3% compared with the commonly adopted score combination method at comparable accuracy.
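
As a rough illustration of why merging cross-system redundant information keeps the index compact, the following minimal sketch combines two confusion networks slot by slot and builds a single inverted index from the result. It is not the method evaluated in this paper: it assumes the slots of the two systems are already aligned one-to-one, and it uses a plain linear interpolation of posteriors. The names `merge_slot`, `build_index`, and `weight`, as well as the toy posteriors, are hypothetical.

```python
from collections import defaultdict


def merge_slot(word_slot, syl_slot, weight=0.5):
    """Merge two aligned slots, each a dict {syllable: posterior}.

    A syllable hypothesized by both systems is stored once, with an
    interpolated posterior (an assumed combination rule).
    """
    merged = defaultdict(float)
    for syl, p in word_slot.items():
        merged[syl] += weight * p
    for syl, p in syl_slot.items():
        merged[syl] += (1.0 - weight) * p
    return dict(merged)


def build_index(composite_scn):
    """Build an inverted index: syllable -> list of (slot position, posterior)."""
    index = defaultdict(list)
    for pos, slot in enumerate(composite_scn):
        for syl, p in slot.items():
            index[syl].append((pos, p))
    return dict(index)


# Toy SCNs from a word-based and a syllable-based system (assumed aligned).
word_scn = [{"zhong": 0.9, "zong": 0.1}, {"guo": 1.0}]
syl_scn = [{"zhong": 0.7, "chong": 0.3}, {"guo": 0.8, "kuo": 0.2}]

composite = [merge_slot(w, s) for w, s in zip(word_scn, syl_scn)]
index = build_index(composite)

# Postings if each system kept its own index, versus the merged index.
separate = sum(len(s) for s in word_scn) + sum(len(s) for s in syl_scn)
merged_count = sum(len(postings) for postings in index.values())
```

In this toy case the two systems share the hypotheses "zhong" and "guo", so the merged index holds five postings where two separate indices would hold seven. The saving in a real system depends on how much the two hypothesis sets overlap.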