CITS5508 Machine Learning Semester 1, 2024
Assignment 2
Assessed, worth 15%. Due: 8pm, Friday 3 May 2024
Discussion is encouraged, but all work must be done and submitted individually. This assignment has 21 tasks, of which 20 are assessed, totalling 45 marks.
You will develop Python code for classification tasks. You will use grid search and cross-validation to find the optimal hyperparameters of the model, and discuss and interpret the different decisions and their impact on the model's performance and interpretability.
1 Submission
Your submission consists of two files. The first file is a report describing your analysis/results. Your analysis should provide the requested plots, tables and your reflections about the results. Each deliverable task is indicated as D and a number. Your report should be submitted as a ".PDF" file. Name your file as assig1_<student_id>.pdf (where you should replace <student_id> with your student ID).
The second file is your Python notebook with the code supporting your analysis/results. Your code should be submitted as assig1_<student_id>.ipynb, the Jupyter notebook extension.
Submit your files to LMS before the due date and time. You can submit them multiple times. Only the latest version will be marked. Your submission will follow the rules provided in LMS.
Important:
• You must submit the first part of your assignment as an electronic file in PDF format (do not send DOCX, ZIP or any other file format). Only PDF format is accepted, and any other file format will receive a zero mark.
• You should provide comments on your code.
• You must deliver parts one and two to have your assignment assessed. That is, your submission should contain your analysis and your Jupyter notebook with all coding, both with appropriate formatting.
• By submitting your assignment, you acknowledge you have read all instructions provided in this document and LMS.
• There is a general FAQ section and a section in your LMS, Assignments - Assignment 2 - Updates, where you will find updates or clarifications about the tasks when necessary. It is your responsibility to check this page regularly.
• You will be assessed on your thinking and process, not only on your results. A perfect performance without demonstrating an understanding of what you have done will not earn you marks.
• Your answers must be concise. A few sentences (2-5) should be enough to answer most of the open questions. You will be graded on thoughtfulness. If you are writing long answers, rethink what you are doing; it is probably the wrong path.
• You can ask in the lab or during consultation if you need clarification about the assignment questions.
• You should be aware that some algorithms can take a while to run. A good approach to improving their speed in Python is to use the vectorised forms discussed in class. In this case, it is strongly recommended that you start your assignment soon to accommodate the computational time.
• For the functions and tasks that require a random procedure (e.g. splitting the data into 80% training and 20% validation set), you should set the seed of the random generator to the value “5508” or the one(s) specified in the question.
2 Dataset
In this assignment, you are asked to train a few decision tree classifiers on the Breast cancer wisconsin (diagnostic) dataset available on Scikit-Learn and compare their performances.
A description of this dataset can be found on the Scikit-Learn web page:
https://scikit-learn.org/stable/datasets/toy_dataset.html#breast-cancer-wisconsin-diagnostic-dataset
There are two classes in the dataset:
• malignant (212 instances, class value 0) and
• benign (357 instances, class value 1).
Follow the example code given on the web page
https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html#sklearn.datasets.load_breast_cancer
to read the dataset and separate it into a feature matrix and a class vector. Your feature matrix should have 569 rows × 30 columns, and your class vector should have 569 elements.
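For reference, a minimal sketch of this loading step (the variable names X and y are our own choice, not mandated by the assignment):

    # Load the dataset as pandas objects; as_frame=True makes the later
    # column re-ordering and feature dropping easier.
    from sklearn.datasets import load_breast_cancer

    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
    print(X.shape)  # expected: (569, 30)
    print(y.shape)  # expected: (569,)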
In all requested implementations using Decision Trees, Random Forests or data splits (such as when using train_test_split()), you should set random_state as specified, for reproducibility of results. You should aim to round your results to the second decimal place.
3 Tasks
First inspections of the dataset and preprocessing
D1 2 marks
Re-order the columns in your feature matrix (or dataframe) based on the column name. Provide a scatter plot to inspect the relationship between the first 10 features in the dataset. Use different colours in the visualisation to show instances coming from each class. (Hint: use a grid plot.)
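One possible approach, assuming the X and y objects from the loading sketch above (seaborn's pairplot is one way to obtain the grid; any equivalent grid plot would do):

    import seaborn as sns
    import matplotlib.pyplot as plt

    # Re-order the columns alphabetically by name, then plot the first 10
    # features pairwise, coloured by class.
    X_sorted = X.reindex(sorted(X.columns), axis=1)
    df = X_sorted.iloc[:, :10].copy()
    df["target"] = y
    sns.pairplot(df, hue="target", plot_kws={"s": 10})
    plt.show()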
D2 2 marks
Provide a few comments about what you can observe from the scatter plot:
• What can be observed regarding the relationship between these features?
• Can you observe the presence of clusters or groups? How do they relate to the target variable?
• Are there any instances that could be outliers?
• Are there features that could be removed? Why or why not?
D3 1 mark
Compute and show the correlation matrix where each cell contains the correlation coefficient between the corresponding pair of features (Hint: you may use the heatmap function from the seaborn package).
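A sketch using seaborn's heatmap, assuming the re-ordered dataframe X_sorted from D1:

    import seaborn as sns
    import matplotlib.pyplot as plt

    # Pairwise Pearson correlation coefficients between all features,
    # annotated to two decimal places.
    corr = X_sorted.corr()
    plt.figure(figsize=(14, 12))
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm")
    plt.show()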
D4 1 mark
Do the correlation coefficients support your previous observations?
D5
In a data science project, it is crucial not just to remove highly correlated features but to consider the context and implications of feature selection carefully. Blindly dropping features may lead to the loss of valuable information or unintended bias in the model's performance. Here, for the assignment context, we will drop a few features to simplify the classification tasks and speed up the computation. Write code that drops the features mean perimeter, mean radius, worst radius, worst perimeter and radius error. These are features with a linear correlation higher than 0.97 in magnitude with some other feature kept in the data.
After this process, your data matrix should be updated accordingly and contain 25 features (a sketch follows below). Task D5 must be performed; otherwise, your other deliverable tasks will be incorrect. However, there are no marks for task D5.
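A minimal sketch of the dropping step, assuming X_sorted from D1:

    # Drop the five features that are highly correlated with features we keep.
    to_drop = ["mean perimeter", "mean radius", "worst radius",
               "worst perimeter", "radius error"]
    X_reduced = X_sorted.drop(columns=to_drop)
    print(X_reduced.shape)  # expected: (569, 25)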
Fitting a Decision Tree model with default hyperparameters
D6 3 marks
Fit a decision tree classifier with default hyperparameters using 80% of the data. Remember to set the random generator's state to the value "5508" (for both the split and the class). Use the trained classifier to perform predictions on the training and test sets. Provide the accuracy, precision and recall scores for both sets, and the confusion matrix for the test set. (A sketch follows below.)
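A sketch of this task, assuming X_reduced and y from the steps above:

    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import (accuracy_score, precision_score,
                                 recall_score, confusion_matrix)

    # 80%-20% split and default-hyperparameter tree, both seeded with 5508.
    X_train, X_test, y_train, y_test = train_test_split(
        X_reduced, y, train_size=0.8, random_state=5508)
    clf = DecisionTreeClassifier(random_state=5508).fit(X_train, y_train)

    # Scores on both sets, rounded to two decimal places.
    for name, X_, y_ in [("train", X_train, y_train), ("test", X_test, y_test)]:
        pred = clf.predict(X_)
        print(name, round(accuracy_score(y_, pred), 2),
              round(precision_score(y_, pred), 2),
              round(recall_score(y_, pred), 2))
    print(confusion_matrix(y_test, clf.predict(X_test)))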
D7 2 marks
Comment on these results. Do you think your classifier is overfitting? If so, why is this happening? If not, why not?
D8 2 marks
Display the decision tree built from the training process (like the one shown in Figure 6.1 of the textbook for the iris dataset).
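One way to display the tree, assuming clf and X_train from D6 (plot_tree is Scikit-Learn's built-in renderer; export_graphviz is an alternative):

    import matplotlib.pyplot as plt
    from sklearn.tree import plot_tree

    # Draw the fitted tree with feature names and class labels.
    plt.figure(figsize=(20, 10))
    plot_tree(clf, feature_names=list(X_train.columns),
              class_names=["malignant", "benign"], filled=True)
    plt.show()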
D9 2 marks
Study the tree diagram and comment on the following:
• How many levels resulted from the model?
• Did the diagram help you to confirm whether the classifier has an overfitting issue?
• What can you observe from the leaves?
• Is this an interpretable model?
D10 3 marks
Repeat the data split another four times, each using 80% of the data to train the model and the remaining 20% for testing. For these splits, set the seed of the random state to the values “5509”, “5510”, “5511” and “5512”. The random state of the model can be kept at “5508”.
For each of these four splits, fit the decision tree classifier and use the trained classifier to perform predictions on the test set. Provide three plots to show the accuracy, precision and recall scores for the test set for each split and comment on the consistency of the results in the five splits (including the original split with random state “5508”).
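A sketch of the five splits, assuming X_reduced and y as before; only the split seed varies, while the model's random_state stays at 5508:

    scores = {"accuracy": [], "precision": [], "recall": []}
    for seed in [5508, 5509, 5510, 5511, 5512]:
        X_tr, X_te, y_tr, y_te = train_test_split(
            X_reduced, y, train_size=0.8, random_state=seed)
        model = DecisionTreeClassifier(random_state=5508).fit(X_tr, y_tr)
        pred = model.predict(X_te)
        scores["accuracy"].append(accuracy_score(y_te, pred))
        scores["precision"].append(precision_score(y_te, pred))
        scores["recall"].append(recall_score(y_te, pred))
    # each of the three lists can then be drawn as one of the requested plots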
D11 3 marks
Investigate the impact of the training size on the performance of the model. You will do five different splits: 50%-50% (training the model on 50% of the data and testing on the remaining 50%), 60%-40%, 70%-30%, 80%-20% and 90%-10%. For each of these data splits, set back the seed of the random state to the value “5508”.
Provide three plots to show the accuracy, precision and recall scores for the test set for each data split and comment on the results. Did the performance behave as you expected?
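A sketch of the five training-size splits, with the seed fixed back to 5508:

    for train_size in [0.5, 0.6, 0.7, 0.8, 0.9]:
        X_tr, X_te, y_tr, y_te = train_test_split(
            X_reduced, y, train_size=train_size, random_state=5508)
        model = DecisionTreeClassifier(random_state=5508).fit(X_tr, y_tr)
        pred = model.predict(X_te)
        print(train_size, round(accuracy_score(y_te, pred), 2),
              round(precision_score(y_te, pred), 2),
              round(recall_score(y_te, pred), 2))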
Fitting a Decision Tree model with optimal hyperparameters
D12 4 marks
Create a training set using 80% of the data and a test set with the remaining 20%. Use 10-fold cross-validation and grid search to find the optimal combination of the decision tree hyperparameters max_depth (values [2, 3, 4, 5]), min_samples_split (values [2, 4, 5, 10]) and min_samples_leaf (values [2, 5]). Remember to set the seed of the random state of the data split function and model class to the value "5508". For the cross-validation, set the value of the random state to "42". Use accuracy for the scoring argument of the grid-search function. (A code sketch is given after the report list below.)
With the optimal hyperparameters obtained, retrain the model and report:

• The optimal hyperparameters;
• The obtained accuracy, precision and recall on the training set;
• The obtained accuracy, precision and recall on the test set;
• The confusion matrix on the test set.
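A sketch of the search, assuming the X_train/X_test split from D6; the StratifiedKFold splitter is an assumption, since any 10-fold splitter that accepts random_state=42 fits the task description:

    from sklearn.model_selection import GridSearchCV, StratifiedKFold

    param_grid = {"max_depth": [2, 3, 4, 5],
                  "min_samples_split": [2, 4, 5, 10],
                  "min_samples_leaf": [2, 5]}
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

    gs = GridSearchCV(DecisionTreeClassifier(random_state=5508),
                      param_grid, cv=cv, scoring="accuracy")
    gs.fit(X_train, y_train)
    print(gs.best_params_)
    best = gs.best_estimator_  # refit on the full training set by default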
D13 2 marks
Comment: What was the impact of fine-tuning the hyperparameters compared with what you obtained in D6? Has fine-tuning done what you expected?
D14 3 marks
Repeat the training of task D12 twice: once with the scoring argument of the grid-search function set to precision, and once with it set to recall.
For each of the scoring options (accuracy, precision, recall), provide the optimal hyperparame- ters according to the 10-fold cross-validation and grid-search, and, after retraining each model accordingly, provide the confusion matrix on the test set. Comment on the results, considering the problem.
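A sketch, reusing param_grid and cv from the D12 sketch above:

    for scoring in ["accuracy", "precision", "recall"]:
        gs = GridSearchCV(DecisionTreeClassifier(random_state=5508),
                          param_grid, cv=cv, scoring=scoring)
        gs.fit(X_train, y_train)
        print(scoring, gs.best_params_)
        print(confusion_matrix(y_test, gs.best_estimator_.predict(X_test)))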
Fitting a Decision Tree with optimal hyperparameters and a reduced feature set
D15 1 mark
Using the model with fine-tuned hyperparameters based on accuracy (the one you obtained in D12), display the feature importance for each feature obtained from the training process. You should sort the feature importances in descending order.
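A sketch, assuming best (the accuracy-tuned D12 estimator) and X_train from above:

    import pandas as pd

    # feature_importances_ aligns with the training columns; sort descending.
    importances = pd.Series(best.feature_importances_, index=X_train.columns)
    print(importances.sort_values(ascending=False).round(2))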
D16 3 marks
Using the feature importance you calculated in the previous task, trim the feature dimension of the data. That is, you should retain only those features whose importance values are above 1% (i.e., 0.01). You can either write your own Python code or use the function SelectFromModel from the sklearn.feature_selection package to work out which feature(s) can be removed.
Report what features were retained and removed in the above process. Also report the total feature importance value that is retained after your dimension reduction step.
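A sketch using SelectFromModel with the 1% threshold, applied to the already-fitted D12 estimator via prefit=True:

    from sklearn.feature_selection import SelectFromModel

    selector = SelectFromModel(best, threshold=0.01, prefit=True)
    mask = selector.get_support()
    kept, removed = X_train.columns[mask], X_train.columns[~mask]

    print("retained:", list(kept))
    print("removed:", list(removed))
    print("total retained importance:", round(importances[mask].sum(), 2))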
D17 3 marks
Compare the model’s performance (accuracy, precision, recall) on training and test sets when using the reduced set of features and the model trained on the complete set of features. Also, report the corresponding confusion matrices on the test sets. (You will need to consider whether you should repeat the cross-validation process to find the optimal hyperparameters).
D18 1 mark
Comment on your results. What was the impact (if any) of reducing the number of features?
Fitting a Random Forest
D19 3 marks
Considering all features and the 80%-20% data split you did before, use 10-fold cross-validation and grid search to find a Random Forest classifier's optimal hyperparameters n_estimators (number of estimators) and max_depth. Remember to set the seed of the random state of the data split function and model class to the value "5508". Use n_estimators: [10, 20, 50, 100, 1000] and max_depth: [2, 3, 4, 5]. For the cross-validation, set the value of the random state to "42". Use accuracy for the scoring argument of the grid-search function. (A sketch is given after the report list below.)
Keep the other hyperparameters at their default values. Use the optimal values for the n_estimators and max_depth hyperparameters to retrain the model and report:
• The obtained optimal number of estimators and max depth;
• The obtained accuracy, precision and recall on the training set;
• The obtained accuracy, precision and recall on the test set;
• The confusion matrix on the test set.
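A sketch of the Random Forest search, reusing the split and the cv splitter from the D12 sketch:

    from sklearn.ensemble import RandomForestClassifier

    rf_grid = {"n_estimators": [10, 20, 50, 100, 1000],
               "max_depth": [2, 3, 4, 5]}
    rf_gs = GridSearchCV(RandomForestClassifier(random_state=5508),
                         rf_grid, cv=cv, scoring="accuracy")
    rf_gs.fit(X_train, y_train)

    print(rf_gs.best_params_)
    print(confusion_matrix(y_test, rf_gs.best_estimator_.predict(X_test)))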
D20
How do these performances compare with the ones you obtained in D12? What changed with the use of a Random Forest model? Is this result what you would expect?
D21 2 marks
Thinking about the application and the different models you created, discuss:
• Do you think these models are good enough and can be trusted to be used in practice?
• Do you think a more complex model is necessary?
• Do you think using a machine learning algorithm for this task is a good idea? That is, should this decision process be automated? Justify.
• Are there any considerations regarding the dataset used?

