

Date: 2024-05-09



COM4511/COM6511 Speech Technology - Practical Exercise -
Keyword Search
Anton Ragni
Note that for any module assignment full marks will only be obtained for outstanding performance that
goes well beyond the questions asked. The marks allocated for each assignment are 20%. The marks will be
assigned according to the following general criteria. For every assignment handed in:
1. Fulfilling the basic requirements (5%)
Full marks will be given to fulfilling the work as described, in source code and results given.
2. Submitting high quality documentation (5%)
Full marks will be given to a write-up that is at the highest standard of technical writing and illustration.
3. Showing good reasoning (5%)
Full marks will be given if the experiments and the outcomes are explained to the best standard.
4. Going beyond what was asked (5%)
Full marks will be given for interesting ideas on how to extend work that are well motivated and
described.
1 Background
The aim of this task is to build and investigate the simplest form of a keyword search (KWS) system, which makes it possible to find information
in large volumes of spoken data. The figure below shows an example of a typical KWS system, which consists of an index and
a search module. The index provides a compact representation of spoken data. Given a set of keywords, the search module
queries the index to retrieve all possible occurrences ranked according to likelihood. The quality of a KWS system is assessed based
on how accurately it can retrieve all true occurrences of keywords.
[Figure: block diagram of a KWS system with the labels Keywords, Search, Index and Results.]
A number of index representations have been proposed and examined for KWS. Most popular representations are derived
from the output of an automatic speech recognition (ASR) system. Various forms of output have been examined. These differ
in terms of the amount of information retained regarding the content of spoken data. The simplest form is the most likely word
sequence or 1-best. Additional information such as start and end times, and recognition confidence may also be provided for
each word. Given a collection of 1-best sequences, the following index can be constructed
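\[
\begin{array}{ll}
w_1 & (f_{1,1}, s_{1,1}, e_{1,1}) \;\ldots\; (f_{1,n_1}, s_{1,n_1}, e_{1,n_1}) \\
w_2 & (f_{2,1}, s_{2,1}, e_{2,1}) \;\ldots\; (f_{2,n_2}, s_{2,n_2}, e_{2,n_2}) \\
\vdots & \\
w_N & (f_{N,1}, s_{N,1}, e_{N,1}) \;\ldots\; (f_{N,n_N}, s_{N,n_N}, e_{N,n_N})
\end{array}
\tag{1}
\]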
where wi is a word, ni is the number of times word wi occurs, fi,j is the file in which word wi occurs for the j-th time, and si,j and ei,j
are the start and end times of that occurrence. Searching such an index for single-word keywords can be as simple as finding the correct row (e.g. k)
and returning all tuples (fk,1, sk,1, ek,1), . . ., (fk,nk , sk,nk , ek,nk ).
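The index in Eq. (1) can be sketched as a dictionary from words to lists of (file, start, end) tuples. The snippet below is a minimal illustration, not a reference solution: the function names `build_index` and `lookup` are hypothetical, the input lines follow the CTM record format described in the Resources section, and lower-casing the words is an assumption made so that recognised words can later be compared with lower-case reference words.

```python
from collections import defaultdict


def build_index(ctm_lines):
    """Map each word to a list of (file, start, end) tuples, as in Eq. (1)."""
    index = defaultdict(list)
    for line in ctm_lines:
        line = line.strip()
        # Blank lines and lines starting with ';;' are ignored in CTM files.
        if not line or line.startswith(";;"):
            continue
        fname, _channel, start, dur, word, _conf = line.split()[:6]
        start, dur = float(start), float(dur)
        # Assumption: words are lower-cased so that case does not affect lookup.
        index[word.lower()].append((fname, start, round(start + dur, 3)))
    return index


def lookup(index, keyword):
    """Return all occurrences of a single-word keyword (empty list if none)."""
    return index.get(keyword.lower(), [])


ctm = [
    "7654 A 11.34 0.2 YES 0.5",
    "7654 A 12.00 0.34 YOU 0.7",
    "7654 A 13.30 0.5 CAN 0.1",
]
index = build_index(ctm)
print(lookup(index, "you"))  # [('7654', 12.0, 12.34)]
```

Multi-word keywords are not handled by this sketch; one possible extension is to also store each word's position within its file so that consecutive words can be joined at search time.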
The search module is expected to retrieve all possible keyword occurrences. If the ASR system made no mistakes, such a module
could be created rather trivially. To account for possible retrieval errors, the search module provides each potential occurrence
with a relevance score. Relevance scores reflect confidence in a given occurrence being relevant. Occurrences with extremely
low relevance scores may be eliminated. If these scores are accurate, each eliminated occurrence will decrease the number of
false alarms. If not, the number of misses will increase. What exactly counts as an extremely low score may not be easy
to determine. Multiple factors may affect a relevance score: confidence score, duration, word confusability, word context and
keyword length. Therefore, simple relevance scores, such as those based on confidence scores, may have a wide dynamic range
and may not be comparable across different keywords. To ensure that relevance scores are comparable across different
keywords, they need to be calibrated. A simple calibration scheme is sum-to-one (STO) normalisation
\[
\hat{r}_{i,j} = \frac{r_{i,j}^{\gamma}}{\sum_{k=1}^{n_i} r_{i,k}^{\gamma}}
\tag{2}
\]
where ri,j is the original relevance score for the j-th occurrence of the i-th keyword and γ is a scale that either sharpens or
flattens the distribution of relevance scores. More complex schemes have also been examined. Given a set of occurrences with
associated relevance scores, there are several options available for eliminating spurious occurrences. One popular approach
is thresholding. Given a global or keyword-specific threshold, any occurrence whose score falls below it is eliminated. Simple calibration
schemes such as STO require thresholds to be estimated on a development set and adjusted to different collection sizes. More
complex approaches such as Keyword Specific Thresholding (KST) yield a fixed threshold across different keywords and
collection sizes.
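As an illustration of Eq. (2), the following minimal sketch normalises the relevance scores of one keyword so that they sum to one. The function name `sto_normalise` and the example scores are hypothetical, and returning zeros when all scores are zero is an assumption rather than part of the definition.

```python
def sto_normalise(scores, gamma=1.0):
    """Sum-to-one normalisation of one keyword's relevance scores, Eq. (2)."""
    powered = [r ** gamma for r in scores]
    total = sum(powered)
    if total == 0:
        # Assumption: with no score mass there is nothing to normalise.
        return [0.0 for _ in scores]
    return [p / total for p in powered]


scores = [0.5, 0.25, 0.25]               # hypothetical scores for one keyword
flat = sto_normalise(scores, gamma=1.0)   # unchanged here, already sums to one
sharp = sto_normalise(scores, gamma=2.0)  # gamma > 1 sharpens the distribution
print(flat, sharp)
```

With γ above 1 the largest score takes a larger share of the total, and with γ below 1 the shares move closer together.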
Accuracy of KWS systems can be assessed in multiple ways. Standard approaches include precision (the proportion of relevant retrieved occurrences among all retrieved occurrences), recall (the proportion of relevant retrieved occurrences among all
relevant occurrences), mean average precision and term weighted value. A collection of precision and recall values computed
for different thresholds yields a precision-recall (PR) curve. The area under the PR curve (AUC) provides a threshold-independent summary statistic for comparing different retrieval approaches. The mean average precision (mAP) is another popular,
threshold-independent, precision-based metric. Consider a KWS system returning 3 correct and 4 incorrect occurrences arranged according to relevance score as follows: ✓ , ✗ , ✗ , ✓ , ✓ , ✗ , ✗ , where ✓ stands for a correct occurrence and ✗ stands
for an incorrect occurrence. The average precision at each rank (from 1 to 7), counted as zero at ranks holding an incorrect occurrence, is 1/1, 0/2, 0/3, 2/4, 3/5, 0/6, 0/7. If the number of true correct
occurrences is 3, the mean average precision for this keyword is (1/1 + 2/4 + 3/5)/3 = 0.7. A collection-level mAP can be computed by averaging
keyword-specific mAPs. Once a KWS system operates at a reasonable AUC or mAP level, it is possible to use term weighted
value (TWV) to assess the accuracy of thresholding. The TWV is defined by
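\[
\mathrm{TWV}(K, \theta) = 1 - \left[ \frac{1}{|K|} \sum_{k \in K} \left( P_{\mathrm{miss}}(k, \theta) + \beta P_{\mathrm{fa}}(k, \theta) \right) \right]
\tag{3}
\]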
where k ∈ K is a keyword, θ is the threshold, Pmiss and Pfa are the probabilities of a miss and a false alarm, and β is a penalty assigned to false alarms.
These probabilities can be computed by
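\[
P_{\mathrm{miss}}(k, \theta) = \frac{N_{\mathrm{miss}}(k, \theta)}{N_{\mathrm{correct}}(k)}
\tag{4}
\]
\[
P_{\mathrm{fa}}(k, \theta) = \frac{N_{\mathrm{fa}}(k, \theta)}{N_{\mathrm{trial}}(k)}
\tag{5}
\]
where N<event> is a number of events. The number of trials is given by
\[
N_{\mathrm{trial}}(k) = T - N_{\mathrm{correct}}(k)
\tag{6}
\]
where T is the duration of speech in seconds.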
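Equations (3) to (6) can be combined into a few lines of code. The sketch below assumes that the per-keyword counts have already been obtained at a fixed threshold. The function name `twv`, the layout of `counts` and the example numbers are hypothetical. Skipping keywords with no true occurrences is also an assumption, made because Eq. (4) is undefined for them.

```python
def twv(counts, duration, beta=20.0):
    """Term weighted value, Eq. (3), from per-keyword counts at one threshold.

    counts maps keyword -> (n_correct, n_miss, n_fa); duration is T in seconds.
    """
    total = 0.0
    n_keywords = 0
    for n_correct, n_miss, n_fa in counts.values():
        if n_correct == 0:
            # Assumption: P_miss is undefined here, so the keyword is skipped.
            continue
        p_miss = n_miss / n_correct            # Eq. (4)
        p_fa = n_fa / (duration - n_correct)   # Eq. (5) with Eq. (6)
        total += p_miss + beta * p_fa
        n_keywords += 1
    if n_keywords == 0:
        raise ValueError("no keyword has any true occurrence")
    return 1.0 - total / n_keywords


# Hypothetical counts: (true occurrences, misses, false alarms).
counts = {"yes": (10, 2, 5), "dog walking": (4, 1, 0)}
value = twv(counts, duration=1000.0, beta=20.0)
print(round(value, 4))
```

A system with no misses and no false alarms scores 1, while a system that returns nothing scores 0, since every keyword then has a miss probability of 1.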
2 Objective
Given a collection of 1-bests, write code that retrieves all possible occurrences of the keywords in the provided list. Describe the search
process including index format, handling of multi-word keywords, criterion for matching, relevance score calibration and
threshold setting methodology. Write code to assess retrieval performance using reference transcriptions according to the AUC,
mAP and TWV criteria using β = 20. Comment on the differences between these criteria, including the impact of parameter β.
Start and end times of hypothesised occurrences must be within 0.5 seconds of true occurrences to be considered for matching.
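One possible reading of the 0.5 second matching criterion is sketched below. It assumes that occurrences are (file, start, end) tuples, that both the start and the end time must lie within the tolerance, and that each true occurrence can be matched at most once, using a simple greedy pass. The names `is_match` and `score_occurrences` are hypothetical and other interpretations are possible.

```python
def is_match(hyp, ref, tolerance=0.5):
    """True if hyp and ref are in the same file and their times agree."""
    h_file, h_start, h_end = hyp
    r_file, r_start, r_end = ref
    return (
        h_file == r_file
        and abs(h_start - r_start) <= tolerance
        and abs(h_end - r_end) <= tolerance
    )


def score_occurrences(hyps, refs, tolerance=0.5):
    """Greedy one-to-one matching; returns (hits, false alarms, misses)."""
    unmatched = list(refs)
    hits = 0
    for hyp in hyps:
        for ref in unmatched:
            if is_match(hyp, ref, tolerance):
                unmatched.remove(ref)  # each true occurrence is used once
                hits += 1
                break
    return hits, len(hyps) - hits, len(unmatched)


# Hypothetical occurrences of one keyword.
hyps = [("7654", 12.0, 12.34), ("7654", 40.0, 40.3)]
refs = [("7654", 12.2, 12.6), ("7654", 90.0, 90.4)]
print(score_occurrences(hyps, refs))  # (1, 1, 1)
```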
3 Marking scheme
Two critical elements are assessed: retrieval (65%) and assessment (35%). Note: Even if you cannot complete this task as a
whole you can certainly provide a description of what you were planning to accomplish.
1. Retrieval
1.1 Index Write code that can take the provided CTM files (and any other files you deem relevant) and create indices in
your own format. For example, if the Python language is used, then the execution of your code may look like
python index.py dev.ctm dev.index
where dev.ctm is a CTM file and dev.index is an index.
Marks are distributed based on handling of multi-word keywords
• Efficient handling of single-word keywords
• No ability to handle multi-word keywords
• Inefficient ability to handle multi-word keywords
• Or efficient ability to handle multi-word keywords
1.2 Search Write code that can take the provided keyword file and index file (and any other files you deem relevant)
and produce a list of occurrences for each provided keyword. For example, if the Python language is used, then the
execution of your code may look like
python search.py dev.index keywords dev.occ
where dev.index is an index, keywords is a list of keywords and dev.occ is a list of occurrences for each
keyword.
Marks are distributed based on handling of multi-word keywords
• Efficient handling of single-word keywords
• No ability to handle multi-word keywords
• Inefficient ability to handle multi-word keywords
• Or efficient ability to handle multi-word keywords
1.3 Description Provide a technical description of the following elements
• Index file format
• Handling multi-word keywords
• Criterion for matching keywords to possible occurrences
• Search process
• Score calibration
• Threshold setting
2. Assessment Write code that can take the provided keyword file, the list of found keyword occurrences and the corresponding reference transcript file in STM format, and compute the metrics described in the Background section. For
instance, if the Python language is used, then the execution of your code may look like
python <metric>.py keywords dev.occ dev.stm
where <metric> is one of precision-recall, mAP and TWV, keywords is the provided keyword file, dev.occ is the
list of found keyword occurrences and dev.stm is the reference transcript file.
Hint: To simplify assessment, consider converting the reference transcript from STM file format to CTM file format.
Using the indexing and search code above, obtain a list of true occurrences. The list of found keyword occurrences can then
be assessed more easily by comparing it with the list of true occurrences rather than with the reference transcript file in STM
file format.
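The conversion suggested in the hint can be sketched as follows, using the uniform segmentation mentioned in the Resources section: each STM segment is split into equal-length slots, one per word. The function name `stm_to_ctm` is hypothetical. The sketch also assumes that the label field contains no spaces, that lines starting with ;; are comments as in CTM files, and that a confidence of 1.0 is a sensible placeholder for reference words.

```python
def stm_to_ctm(stm_lines):
    """Split each STM segment uniformly into CTM-style word records."""
    records = []
    for line in stm_lines:
        line = line.strip()
        # Assumption: ';;' marks comment lines, as it does in CTM files.
        if not line or line.startswith(";;"):
            continue
        fname, channel, _speaker, start, end, _label, *words = line.split()
        if not words:
            continue
        start, end = float(start), float(end)
        dur = (end - start) / len(words)  # every word gets the same duration
        for i, word in enumerate(words):
            records.append(
                (fname, channel, round(start + i * dur, 3), round(dur, 3), word, 1.0)
            )
    return records


stm = ["2345 A 2345-a 0.10 2.03 <soap> uh huh yes i thought"]
ctm_records = stm_to_ctm(stm)
for rec in ctm_records:
    print(*rec)
```

The resulting records have the same fields as CTM lines, so the same indexing and search code can be reused to list the true occurrences.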
2.1 Implementation
• AUC Integrate an existing implementation of AUC computation into your code. For example, for the Python
language such an implementation is available in the sklearn package.
• mAP Write your own implementation or integrate any freely available.
• TWV Write your own implementation or integrate any freely available.
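For the mAP item, the worked example from the Background section (✓ ✗ ✗ ✓ ✓ ✗ ✗ with 3 true occurrences) can be reproduced with a short function. This is a minimal sketch with hypothetical names. It assumes that the occurrences of a keyword are already sorted by decreasing relevance score and marked as correct or incorrect.

```python
def average_precision(ranked_hits, n_true):
    """Per-keyword precision averaged over the ranks of correct occurrences.

    ranked_hits: booleans ordered by decreasing relevance score.
    n_true: number of true occurrences of the keyword in the reference.
    """
    if n_true == 0:
        raise ValueError("keyword has no true occurrences")
    n_correct = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            n_correct += 1
            precision_sum += n_correct / rank  # precision at this rank
    return precision_sum / n_true


def mean_average_precision(per_keyword):
    """Average the per-keyword values; per_keyword holds (ranked_hits, n_true)."""
    values = [average_precision(hits, n_true) for hits, n_true in per_keyword]
    return sum(values) / len(values)


hits = [True, False, False, True, True, False, False]
print(round(average_precision(hits, 3), 3))  # 0.7
```

Dividing by the number of true occurrences rather than by the number of retrieved correct ones means that missed occurrences lower the score.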
2.2 Description
• AUC Plot the precision-recall curve. Report the AUC value. Discuss performance in the high precision and low
recall area. Discuss performance in the high recall and low precision area. Suggest which keyword search
applications might be interested in good performance specifically in those two areas (either high precision
and low recall, or high recall and low precision).
• mAP Report mAP value. Report mAP value for each keyword length (1-word, 2-word, etc.). Compare and
discuss differences in mAP values.
• TWV Report TWV value. Report TWV value for each keyword length (1-word, 2-word, etc.). Compare and
discuss differences in TWV values. Plot TWV values for a range of threshold values. Report maximum TWV
value or MTWV. Report actual TWV value or ATWV obtained with a method used for threshold selection.
• Comparison Describe the use of AUC, mAP and TWV in the development of your KWS approach. Compare
these metrics and discuss their advantages and disadvantages.
4 Hand-in procedure
All outcomes, however complete, are to be submitted jointly in the form of a package file (zip/tar/gzip) that includes
directories for each task which contain the associated required files. Submission will be performed via MOLE.
5 Resources
Three resources are provided for this task:
• 1-best transcripts in NIST CTM file format (dev.ctm,eval.ctm). The CTM file format consists of multiple records
of the following form
<F> <H> <T> <D> <W> <C>
where <F> is an audio file name, <H> is a channel, <T> is a start time in seconds, <D> is a duration in seconds, <W> is a
word, <C> is a confidence score. Each record corresponds to one recognised word. Any blank lines or lines starting with
;; are ignored. An excerpt from a CTM file is shown below
7654 A 11.34 0.2 YES 0.5
7654 A 12.00 0.34 YOU 0.7
7654 A 13.30 0.5 CAN 0.1
• Reference transcript in NIST STM file format (dev.stm, eval.stm). The STM file format consists of multiple records
of the following form
<F> <H> <S> <T> <E> <L> <W>...<W>
where <S> is a speaker, <E> is an end time, <L> is a topic and <W>...<W> is a word sequence. Each record corresponds to
one manually transcribed segment of an audio file. An excerpt from an STM file is shown below
2345 A 2345-a 0.10 2.03 <soap> uh huh yes i thought
2345 A 2345-b 2.10 3.04 <soap> dog walking is a very
2345 A 2345-a 3.50 4.59 <soap> yes but it’s worth it
Note that exact start and end times for each word are not available. Use uniform segmentation as an approximation. The
duration of speech in dev.stm and eval.stm is estimated to be 57474.2 and 25694.3 seconds respectively.
• Keyword list keywords. Each keyword contains one or more words as shown below