

Date: 2024-04-30



COM4511/COM6511 Speech Technology - Practical Exercise - Keyword Search
Anton Ragni
Note that for any module assignment full marks will only be obtained for outstanding performance that
goes well beyond the questions asked. The marks allocated for each assignment are 20%. The marks will be
assigned according to the following general criteria. For every assignment handed in:
1. Fulfilling the basic requirements (5%)
Full marks will be given for completing the work as described, with source code and results provided.
2. Submitting high quality documentation (5%)
Full marks will be given to a write-up that is at the highest standard of technical writing and illustration.
3. Showing good reasoning (5%)
Full marks will be given if the experiments and the outcomes are explained to the best standard.
4. Going beyond what was asked (5%)
Full marks will be given for interesting ideas on how to extend work that are well motivated and
described.
1 Background
The aim of this task is to build and investigate the simplest form of a keyword search (KWS) system for finding information
in large volumes of spoken data. The figure below shows an example of a typical KWS system, which consists of an index and
a search module. The index provides a compact representation of spoken data. Given a set of keywords, the search module
queries the index to retrieve all possible occurrences ranked according to likelihood. The quality of a KWS system is assessed
based on how accurately it can retrieve all true occurrences of keywords.
[Figure: a typical KWS system. Keywords are passed to the Search module, which queries the Index and returns Results.]
A number of index representations have been proposed and examined for KWS. Most popular representations are derived
from the output of an automatic speech recognition (ASR) system. Various forms of output have been examined. These differ
in terms of the amount of information retained regarding the content of spoken data. The simplest form is the most likely word
sequence or 1-best. Additional information such as start and end times, and recognition confidence may also be provided for
each word. Given a collection of 1-best sequences, the following index can be constructed
$$
\begin{array}{ll}
w_1 & (f_{1,1}, s_{1,1}, e_{1,1}) \;\ldots\; (f_{1,n_1}, s_{1,n_1}, e_{1,n_1}) \\
w_2 & (f_{2,1}, s_{2,1}, e_{2,1}) \;\ldots\; (f_{2,n_2}, s_{2,n_2}, e_{2,n_2}) \\
\vdots & \\
w_N & (f_{N,1}, s_{N,1}, e_{N,1}) \;\ldots\; (f_{N,n_N}, s_{N,n_N}, e_{N,n_N})
\end{array}
\tag{1}
$$
where $w_i$ is a word, $n_i$ is the number of times word $w_i$ occurs, $f_{i,j}$ is the file in which word $w_i$ occurs for the
$j$-th time, and $s_{i,j}$ and $e_{i,j}$ are the start and end times of that occurrence. Searching such an index for single-word
keywords can be as simple as finding the correct row (e.g. $k$) and returning all possible tuples
$(f_{k,1}, s_{k,1}, e_{k,1}), \ldots, (f_{k,n_k}, s_{k,n_k}, e_{k,n_k})$.
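The index in (1) can be sketched in a few lines of Python. This is only an illustration, not a prescribed format: the function name `build_index` is hypothetical, the input is assumed to follow the CTM record layout `<F> <H> <T> <D> <W> <C>` described in the Resources section, and lower-casing words and keeping the confidence score alongside each tuple are assumptions made here for convenience.

```python
from collections import defaultdict

def build_index(ctm_lines):
    """Map each word to a list of (file, start, end, confidence) tuples."""
    index = defaultdict(list)
    for line in ctm_lines:
        line = line.strip()
        # Blank lines and ;; comment lines carry no records.
        if not line or line.startswith(";;"):
            continue
        file_id, _channel, start, duration, word, conf = line.split()[:6]
        start, duration = float(start), float(duration)
        # Assumption: matching is case-insensitive, so words are lower-cased.
        index[word.lower()].append((file_id, start, start + duration, float(conf)))
    return index

# The CTM excerpt from the Resources section.
ctm = [
    "7654 A 11.34 0.2 YES 0.5",
    "7654 A 12.00 0.34 YOU 0.7",
    "7654 A 13.30 0.5 CAN 0.1",
]
index = build_index(ctm)
print(index["yes"])  # all occurrences of the single-word keyword "yes"
```

A dictionary keyed by word makes the row lookup for a single-word keyword a constant-time operation on average.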
The search module is expected to retrieve all possible keyword occurrences. If the ASR system makes no mistakes, such a module
can be built trivially. To account for possible retrieval errors, the search module provides each potential occurrence
with a relevance score. Relevance scores reflect confidence in a given occurrence being relevant. Occurrences with extremely
low relevance scores may be eliminated. If these scores are accurate, each eliminated occurrence will decrease the number of
false alarms. If not, the number of misses will increase. What exactly counts as an extremely low score may not be easy
to determine. Multiple factors may affect a relevance score: confidence score, duration, word confusability, word context,
keyword length. Therefore, simple relevance scores, such as those based on confidence scores, may have a wide dynamic range
and may be incomparable across different keywords. In order to ensure that relevance scores are comparable among different
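keywords they need to be calibrated. A simple calibration scheme is called sum-to-one (STO) normalisation
$$
\hat{r}_{i,j} = \frac{r_{i,j}^{\gamma}}{\sum_{m=1}^{n_i} r_{i,m}^{\gamma}} \tag{2}
$$
where $r_{i,j}$ is the original relevance score for the $j$-th occurrence of the $i$-th keyword, $n_i$ is the number of occurrences
of that keyword, and $\gamma$ is a scale that can either sharpen or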
flatten the distribution of relevance scores. More complex schemes have also been examined. Given a set of occurrences with
associated relevance scores, there are several options available for eliminating spurious occurrences. One popular approach
is thresholding. Given a global or keyword-specific threshold, any occurrence scoring below it is eliminated. Simple calibration
schemes such as STO require thresholds to be estimated on a development set and adjusted to different collection sizes. More
complex approaches such as Keyword Specific Thresholding (KST) yield a fixed threshold across different keywords and
collection sizes.
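A minimal sketch of STO normalisation followed by thresholding is given below. It assumes the common form of STO in which each score is raised to the power γ and divided by the sum of the powered scores of the same keyword; the function names and the example scores are hypothetical.

```python
def sto_normalise(scores, gamma=1.0):
    """Sum-to-one normalisation of the relevance scores of ONE keyword."""
    powered = [s ** gamma for s in scores]
    total = sum(powered)
    if total == 0:
        # Assumption: with no score mass, every occurrence gets zero.
        return [0.0 for _ in scores]
    return [p / total for p in powered]

def apply_threshold(scored_occurrences, threshold):
    """Keep (occurrence, score) pairs whose score is not below the threshold."""
    return [(occ, s) for occ, s in scored_occurrences if s >= threshold]

scores = [0.5, 0.3, 0.2]
flat = sto_normalise(scores, gamma=0.5)   # gamma < 1 flattens the distribution
sharp = sto_normalise(scores, gamma=2.0)  # gamma > 1 sharpens it
kept = apply_threshold(list(zip(["a", "b", "c"], sharp)), threshold=0.2)
```

After normalisation the scores of each keyword sum to one, so a keyword with many hypothesised occurrences gets lower individual scores than a keyword with few. This is why a threshold tuned on one collection size may need adjusting for another.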
Accuracy of KWS systems can be assessed in multiple ways. Standard approaches include precision (the proportion of relevant
retrieved occurrences among all retrieved occurrences), recall (the proportion of relevant retrieved occurrences among all
relevant occurrences), mean average precision and term weighted value. A collection of precision and recall values computed
for different thresholds yields a precision-recall (PR) curve. The area under the PR curve (AUC) provides a threshold-independent
summary statistic for comparing different retrieval approaches. The mean average precision (mAP) is another popular,
threshold-independent, precision-based metric. Consider a KWS system returning 3 correct and 4 incorrect occurrences arranged
according to relevance score as follows: ✓, ✗, ✗, ✓, ✓, ✗, ✗, where ✓ stands for a correct occurrence and ✗ stands
for an incorrect occurrence. The precision at each rank (from 1 to 7) is 1/1, 1/2, 1/3, 2/4, 3/5, 3/6, 3/7. If the number of
true correct occurrences is 3, the mean average precision for this keyword is obtained by averaging the precision at the ranks
of the correct occurrences: (1/1 + 2/4 + 3/5)/3 = 0.7. A collection-level mAP can be computed by averaging
keyword-specific mAPs. Once a KWS system operates at a reasonable AUC or mAP level, it is possible to use the term weighted
value (TWV) to assess the accuracy of thresholding. The TWV is defined by
 
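$$
\mathrm{TWV}(\theta) = 1 - \frac{1}{|K|} \sum_{k \in K} \left( P_{\mathrm{miss}}(k, \theta) + \beta \, P_{\mathrm{fa}}(k, \theta) \right) \tag{3}
$$
where $k \in K$ is a keyword, $\theta$ is the threshold, $P_{\mathrm{miss}}$ and $P_{\mathrm{fa}}$ are the probabilities of miss and
false alarm, and $\beta$ is a penalty assigned to false alarms. These probabilities can be computed by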
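$$
P_{\mathrm{miss}}(k, \theta) = \frac{N_{\mathrm{miss}}(k, \theta)}{N_{\mathrm{correct}}(k)} \tag{4}
$$
$$
P_{\mathrm{fa}}(k, \theta) = \frac{N_{\mathrm{fa}}(k, \theta)}{N_{\mathrm{trial}}(k)} \tag{5}
$$
where $N_{\langle\mathrm{event}\rangle}$ is the number of events of the given type. The number of trials is given by
$$
N_{\mathrm{trial}}(k) = T - N_{\mathrm{correct}}(k) \tag{6}
$$
where $T$ is the duration of speech in seconds.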
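The TWV computation can be sketched as follows. The sketch assumes the usual definition of TWV as one minus the average, over keywords, of the miss probability plus β times the false alarm probability, with the probabilities computed as in (4) to (6). The function name `twv`, the layout of the `counts` dictionary and the decision to skip keywords with no true occurrences are assumptions of this example.

```python
def twv(counts, speech_duration, beta=20.0):
    """Term weighted value at one fixed threshold.

    counts maps keyword -> (n_correct, n_miss, n_fa), where n_correct is the
    number of true occurrences, n_miss the number of true occurrences not
    retrieved and n_fa the number of retrieved occurrences that are wrong.
    """
    total = 0.0
    n_keywords = 0
    for n_correct, n_miss, n_fa in counts.values():
        if n_correct == 0:
            # Assumption: P_miss is undefined here, so the keyword is skipped.
            continue
        p_miss = n_miss / n_correct
        n_trial = speech_duration - n_correct
        p_fa = n_fa / n_trial
        total += p_miss + beta * p_fa
        n_keywords += 1
    if n_keywords == 0:
        raise ValueError("no keyword with at least one true occurrence")
    return 1.0 - total / n_keywords

# Hypothetical counts for two keywords in 110 seconds of speech.
example = {"yes": (10, 2, 1), "dog walking": (10, 0, 0)}
value = twv(example, speech_duration=110.0)
```

Evaluating this function over a range of thresholds and taking the maximum gives the MTWV, while the value at the threshold actually chosen by the system gives the ATWV.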
2 Objective
Given a collection of 1-best transcripts, write code that retrieves all possible occurrences of the keywords in the provided list.
Describe the search process, including index format, handling of multi-word keywords, criterion for matching, relevance score
calibration and threshold setting methodology. Write code to assess retrieval performance against reference transcriptions
according to the AUC, mAP and TWV criteria, using β = 20. Comment on the differences between these criteria, including the
impact of the parameter β. Start and end times of hypothesised occurrences must be within 0.5 seconds of those of true
occurrences to be considered a match.
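One possible reading of the 0.5 second criterion is sketched below. It assumes that both the start and the end time must each lie within the tolerance, that occurrences are `(file, start, end)` tuples, and that each true occurrence can be matched by at most one hypothesis. None of these choices is fixed by the task description, and the function names are hypothetical.

```python
def is_match(hyp, ref, tolerance=0.5):
    """True if hyp and ref are in the same file with close start and end times."""
    return (hyp[0] == ref[0]
            and abs(hyp[1] - ref[1]) <= tolerance
            and abs(hyp[2] - ref[2]) <= tolerance)

def label_hypotheses(hyps, refs, tolerance=0.5):
    """Label each hypothesis as correct (True) or a false alarm (False).

    Assumption: greedy one-to-one matching, so a second hypothesis for an
    already matched true occurrence counts as a false alarm.
    """
    used = set()
    labels = []
    for hyp in hyps:
        hit = False
        for i, ref in enumerate(refs):
            if i not in used and is_match(hyp, ref, tolerance):
                used.add(i)
                hit = True
                break
        labels.append(hit)
    return labels

refs = [("f", 1.1, 1.4)]
hyps = [("f", 1.0, 1.5), ("f", 1.2, 1.6), ("g", 1.0, 1.5)]
labels = label_hypotheses(hyps, refs)
```

Hypotheses should be passed in order of decreasing relevance score so that the highest-scoring hypothesis claims a true occurrence first.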
3 Marking scheme
Two critical elements are assessed: retrieval (65%) and assessment (35%). Note: Even if you cannot complete this task as a
whole you can certainly provide a description of what you were planning to accomplish.
1. Retrieval
1.1 Index Write code that takes the provided CTM files (and any other files you deem relevant) and creates indices in
your own format. For example, if Python is used, then the execution of your code may look like
python index.py dev.ctm dev.index
where dev.ctm is a CTM file and dev.index is an index.
Marks are distributed based on handling of multi-word keywords
• Efficient handling of single-word keywords
• No ability to handle multi-word keywords
• Inefficient ability to handle multi-word keywords
• Or efficient ability to handle multi-word keywords
1.2 Search Write code that takes the provided keyword file and index file (and any other files you deem relevant)
and produces a list of occurrences for each provided keyword. For example, if Python is used, then the
execution of your code may look like
python search.py dev.index keywords dev.occ
where dev.index is an index, keywords is a list of keywords and dev.occ is a list of occurrences for each
keyword.
Marks are distributed based on handling of multi-word keywords
• Efficient handling of single-word keywords
• No ability to handle multi-word keywords
• Inefficient ability to handle multi-word keywords
• Or efficient ability to handle multi-word keywords
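One way to handle multi-word keywords without scanning whole transcripts is to store the position of each word within its file, so that the next word of a keyword can be looked up directly. The sketch below illustrates this idea. The index layout, the function names and the `max_gap` limit on silence between consecutive words are all assumptions of this example rather than requirements of the task.

```python
from collections import defaultdict

def build_positional_index(records):
    """records: iterable of (file, start, end, word), time-ordered per file.

    Returns word -> {(file, position): (start, end)}.
    """
    index = defaultdict(dict)
    position = defaultdict(int)
    for file_id, start, end, word in records:
        index[word][(file_id, position[file_id])] = (start, end)
        position[file_id] += 1
    return index

def search(index, keyword, max_gap=0.5):
    """Return (file, start, end) for every occurrence of a keyword."""
    words = keyword.split()
    if not words:
        return []
    hits = []
    for (file_id, pos), (start, end) in index.get(words[0], {}).items():
        last_end = end
        for offset, word in enumerate(words[1:], start=1):
            nxt = index.get(word, {}).get((file_id, pos + offset))
            # The next word must directly follow, without a long pause.
            if nxt is None or nxt[0] - last_end > max_gap:
                break
            last_end = nxt[1]
        else:
            hits.append((file_id, start, last_end))
    return hits

records = [
    ("f", 0.0, 0.3, "dog"), ("f", 0.3, 0.8, "walking"), ("f", 0.9, 1.0, "is"),
    ("f", 3.0, 3.2, "dog"), ("f", 4.5, 4.9, "walking"),
]
idx = build_positional_index(records)
print(search(idx, "dog walking"))
```

Each additional keyword word costs one dictionary lookup per candidate, so the search time grows with the number of occurrences of the first word rather than with the size of the collection.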
1.3 Description Provide a technical description of the following elements
• Index file format
• Handling multi-word keywords
• Criterion for matching keywords to possible occurrences
• Search process
• Score calibration
• Threshold setting
2. Assessment Write code that takes the provided keyword file, the list of found keyword occurrences and the corresponding
reference transcript file in STM format, and computes the metrics described in the Background section. For
instance, if Python is used, then the execution of your code may look like
python <metric>.py keywords dev.occ dev.stm
where <metric> is one of precision-recall, mAP and TWV, keywords is the provided keyword file, dev.occ is the
list of found keyword occurrences and dev.stm is the reference transcript file.
Hint: To simplify assessment, consider converting the reference transcript from STM file format to CTM file format.
Using the indexing and search code above, obtain a list of true occurrences. The list of found keyword occurrences can then
be assessed more easily by comparing it with the list of true occurrences rather than with the reference transcript file in STM
file format.
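The conversion suggested in the hint can be sketched as below. The sketch assumes the STM record layout `<F> <H> <S> <T> <E> <L> <W>...<W>` given in the Resources section, with the label always present as a single token, and splits each segment into equal-length words (uniform segmentation). The function name and the confidence of 1.0 assigned to reference words are assumptions of this example.

```python
def stm_to_ctm(stm_line):
    """Turn one STM segment into CTM-style word records.

    Returns a list of (file, channel, start, duration, word, confidence).
    """
    fields = stm_line.split()
    file_id, channel, _speaker, start, end = fields[:5]
    words = fields[6:]  # fields[5] is the label, e.g. <soap>
    if not words:
        return []
    start, end = float(start), float(end)
    # Uniform segmentation: every word gets the same share of the segment.
    step = (end - start) / len(words)
    return [
        (file_id, channel, start + i * step, step, word, 1.0)
        for i, word in enumerate(words)
    ]

# First segment of the STM excerpt from the Resources section.
records = stm_to_ctm("2345 A 2345-a 0.10 2.03 <soap> uh huh yes i thought")
for rec in records:
    print(rec)
```

Because the word boundaries are only approximations, the reference times of long segments can drift from the true times by more than the matching tolerance, which is worth keeping in mind when interpreting the metrics.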
2.1 Implementation
• AUC Integrate an existing implementation of AUC computation into your code. For example, for Python,
such an implementation is available in the sklearn package.
• mAP Write your own implementation or integrate any freely available one.
• TWV Write your own implementation or integrate any freely available one.
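For the mAP item, a minimal implementation might look as follows. It follows the worked example in the Background section, where the per-keyword value is the sum of the precision at the rank of each correct occurrence divided by the number of true occurrences. The function names are hypothetical, and returning 0.0 for a keyword with no true occurrences is an assumption.

```python
def average_precision(labels, n_true):
    """labels: True/False per returned occurrence, sorted by decreasing score."""
    if n_true == 0:
        return 0.0  # assumption: such keywords contribute zero
    hits = 0
    total = 0.0
    for rank, correct in enumerate(labels, start=1):
        if correct:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_true

def mean_average_precision(per_keyword):
    """per_keyword: list of (labels, n_true) pairs, one per keyword."""
    if not per_keyword:
        return 0.0
    values = [average_precision(labels, n_true) for labels, n_true in per_keyword]
    return sum(values) / len(values)

# The ranking from the Background section: correct at ranks 1, 4 and 5.
ranking = [True, False, False, True, True, False, False]
ap = average_precision(ranking, n_true=3)
```

Dividing by the number of true occurrences rather than by the number of retrieved correct ones means that missed occurrences lower the score.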
2.2 Description
• AUC Plot the precision-recall curve. Report the AUC value. Discuss performance in the high precision and low
recall area. Discuss performance in the high recall and low precision area. Suggest which keyword search
applications might be interested in a good performance specifically in those two areas (either high precision
and low recall, or high recall and low precision).
• mAP Report mAP value. Report mAP value for each keyword length (1-word, 2-word, etc.). Compare and
discuss differences in mAP values.
• TWV Report TWV value. Report TWV value for each keyword length (1-word, 2-word, etc.). Compare and
discuss differences in TWV values. Plot TWV values for a range of threshold values. Report maximum TWV
value or MTWV. Report actual TWV value or ATWV obtained with a method used for threshold selection.
• Comparison Describe the use of AUC, mAP and TWV in the development of your KWS approach. Compare
these metrics and discuss their advantages and disadvantages.
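The points of a precision-recall curve can be obtained by sweeping down the ranked list of occurrences. The dependency-free sketch below is meant only to show where the points come from and does not replace the sklearn implementation suggested for the AUC item. The function names are hypothetical, ties between scores are ignored, and the area is taken only from the first computed point onwards, all of which are simplifying assumptions.

```python
def pr_points(scored_labels, n_true):
    """scored_labels: (score, is_correct) pairs pooled over all keywords.

    Returns one (recall, precision) point per rank of the sorted list.
    """
    ranked = sorted(scored_labels, key=lambda pair: pair[0], reverse=True)
    points = []
    hits = 0
    for rank, (_score, correct) in enumerate(ranked, start=1):
        hits += int(correct)
        points.append((hits / n_true, hits / rank))
    return points

def area_under_curve(points):
    """Trapezoidal area under (recall, precision) points ordered by recall."""
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area

# Hypothetical scored occurrences with two true occurrences in total.
points = pr_points([(0.9, True), (0.8, False), (0.7, True)], n_true=2)
area = area_under_curve(points)
```

Lowering the threshold moves along the list towards higher recall, so the left end of the curve describes the high precision and low recall regime and the right end the high recall and low precision regime.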
4 Hand-in procedure
All outcomes, however complete, are to be submitted jointly in the form of a package file (zip/tar/gzip) that includes
directories for each task containing the associated required files. Submission will be performed via MOLE.
5 Resources
Three resources are provided for this task:
• 1-best transcripts in NIST CTM file format (dev.ctm, eval.ctm). The CTM file format consists of multiple records
of the following form
<F> <H> <T> <D> <W> <C>
where <F> is an audio file name, <H> is a channel, <T> is a start time in seconds, <D> is a duration in seconds, <W> is a
word, <C> is a confidence score. Each record corresponds to one recognised word. Any blank lines or lines starting with
;; are ignored. An excerpt from a CTM file is shown below
7654 A 11.34 0.2 YES 0.5
7654 A 12.00 0.34 YOU 0.7
7654 A 13.30 0.5 CAN 0.1
• Reference transcript in NIST STM file format (dev.stm, eval.stm). The STM file format consists of multiple records
of the following form
<F> <H> <S> <T> <E> <L> <W>...<W>
where <S> is a speaker, <E> is an end time, <L> is a topic and <W>...<W> is a word sequence. Each record corresponds to
one manually transcribed segment of an audio file. An excerpt from an STM file is shown below
2345 A 2345-a 0.10 2.03 <soap> uh huh yes i thought
2345 A 2345-b 2.10 3.04 <soap> dog walking is a very
2345 A 2345-a 3.50 4.59 <soap> yes but it’s worth it
Note that exact start and end times for each word are not available. Use uniform segmentation as an approximation. The
duration of speech in dev.stm and eval.stm is estimated to be 57474.2 and 25694.3 seconds.
• Keyword list keywords. Each keyword contains one or more words as shown below