

    Please use this permanent URL to cite or link to this document: http://163.15.40.127/ir/handle/987654321/1447


    Title: Personal Spoken Sentence Retrieval Using Two-Level Feature Matching and MPEG-7 Audio LLDs
    Authors: Lin, Po-Chuan (林博川)
    Wang, Jhing-Fa (王駿發)
    Wang, Jia-Ching (王家慶)
    Huang, Jun-Jin (黃俊憬)
    (東方技術學院電子與資訊系)
    Contributor: 東方技術學院電子與資訊系
    Keywords: Audio low-level descriptors
    MPEG-7
    spoken sentence
    retrieval
    feature-based comparison
    matching algorithm
    Date: 2009-07
    Uploaded: 2012-11-21 10:23:31 (UTC+8)
    Abstract: Conventional spoken sentence retrieval (SSR) relies on a large-vocabulary continuous speech recognition (LVCSR) system. This investigation proposes a feature-based, speaker-dependent SSR algorithm using two-level matching. Users speak keywords as query inputs to obtain similarity rankings from a spoken sentence database. For instance, if a user is looking for the personal spoken sentence “October 12, I have a meeting in New York” in the database, an appropriate query input could be “meeting”, “New York”, or “October”. In the first level, a Similar Frame Tagging scheme locates possible segments of the database sentences that are similar to the user’s query utterance. In the second level, a Fine Similarity Evaluation is performed between the query and each possible segment. Because it relies on feature-based comparison, the proposed algorithm requires no acoustic or language models, so it is language independent. Effective feature selection is the next issue addressed in this paper. In addition to the conventional mel-frequency cepstral coefficients (MFCCs), several MPEG-7 audio low-level descriptors (LLDs) are used as features to explore their suitability for SSR. Experimental results revealed that retrieval performance using MPEG-7 audio LLDs was close to that of the MFCCs, and that combining MPEG-7 audio LLDs with the MFCCs could further improve retrieval precision. Because it is based on feature matching, the proposed algorithm is language independent and requires no speaker-dependent training. Compared with the original methods [10, 11], the numbers of additions and multiplications are reduced by a factor of around lq (the number of frames in the query), at the cost of a precision decrease of only 0.026 to 0.05. The algorithm is particularly suitable for resource-limited devices.
    Relation: Journal of Information Science and Engineering, Vol. 25, No. 4, pp. 1221-1238
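    Appears in collections: [Department of Electronics and Information (Department of Game Animation, Animation Program)] Journal Articles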

    Files in this item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     870


    All items in TFIR are protected by original copyright.
