Machine learning program needed: Word Sense Disambiguation (WSD) for Hindi using Naive Bayes in Python with Hindi WordNet
$70-125 USD
Completed
Posted over six years ago
Paid on delivery
Need a working end-to-end project on Word Sense Disambiguation (WSD) for the Hindi language, using Naive Bayes in Python with Hindi WordNet.
Word sense disambiguation (WSD) is the task of computationally identifying the ‘meaning’ or ‘sense’ of a word in a particular context.
The output should include precision, recall, F-measure, accuracy, and a confusion matrix.
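To make the requested output concrete, here is a minimal sketch of how the confusion matrix, precision, recall, F-measure and accuracy could be computed from gold and predicted sense labels. It uses only the Python standard library; the labels "gold" and "sleep" are made-up stand-ins for two senses of one ambiguous word, not real Hindi WordNet sense IDs, and the function names are hypothetical.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true senses, columns are predicted senses, in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_scores(matrix, labels):
    """Return {label: (precision, recall, f_measure)} from a confusion matrix."""
    scores = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = sum(row[i] for row in matrix) - tp  # predicted as label, but wrong
        fn = sum(matrix[i]) - tp                 # true label, but missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f_measure)
    return scores


def accuracy(matrix):
    """Fraction of instances on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical labels for illustration only.
labels = ["gold", "sleep"]
y_true = ["gold", "gold", "sleep", "sleep", "sleep"]
y_pred = ["gold", "sleep", "sleep", "sleep", "gold"]

matrix = confusion_matrix(y_true, y_pred, labels)
scores = per_class_scores(matrix, labels)
print("Confusion matrix:", matrix)
print("Accuracy:", accuracy(matrix))
for label in labels:
    print(label, "precision/recall/F-measure:", scores[label])
```

In the actual project the same numbers would more likely come from the sklearn metrics module; the hand-written version is shown only so that each quantity is explicit.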
Detailed requirements are given below:
----------------------------------------------------------------
Project Requirements
(1) Preprocessing: tokenization and stop-word removal
(2) Feature extraction
(3) Build a model using the Naive Bayes algorithm
(4) For each ambiguous word, identify its correct sense (meaning) along with its sense ID
(5) The output should include precision, recall, F-measure, confusion matrix, ROC curve, and accuracy, printed with meaningful labels
(6) Do this for all ambiguous Hindi words
(7) There will be a training data file and a test data file
(8) Use the latest Python version (Anaconda Python in a Jupyter environment) on Windows 10 (64-bit)
(9) Use Hindi WordNet
(10) Also train and test on other Hindi datasets
(11) Comment the code thoroughly so the project is easy to understand
(12) Provide a README file explaining how to set up the project
(13) Also provide a document describing the complete step-by-step procedure for the project (e.g. how the features are extracted and how they are used as input to Naive Bayes)
(14) The project must be based on two research papers, enhanced to cover both nouns and verbs:
(i) "Naive Bayes classifier for Hindi Word Sense Disambiguation" by Satyendr Singh (University of Allahabad, Allahabad, India, satyendr@[login to view URL]),
Tanveer J. Siddiqui (University of Allahabad, Allahabad, India, jktanveer@[login to view URL]) and Sunil K. Sharma (BM., India, [login to view URL]@[login to view URL])
(ii) "Sense Annotated Hindi Corpus" by Satyendr Singh (School of Engineering & Technology, BML Munjal University, Gurgaon, India, satyendr@[login to view URL]) and
Tanveer J. Siddiqui (Department of Electronics & Communication, University of Allahabad, Allahabad, India, [login to view URL]@[login to view URL])
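As an illustration of steps (1) and (2), the sketch below tokenizes a Hindi sentence, removes stop words, and collects the content words in a window around the ambiguous word. Everything here is an assumption for illustration: the stop-word list is a tiny hand-picked sample, the window size of 3 is arbitrary, the function names are hypothetical, and the target is matched by exact surface form (no stemming or morphological analysis). The feature set actually used in the referenced papers may differ.

```python
import re

# Tiny illustrative stop-word sample (assumption); a real project would load a fuller Hindi list.
STOP_WORDS = {"है", "का", "की", "के", "में", "से", "पर", "और", "को", "ने", "था", "थी"}


def tokenize(sentence):
    """Replace punctuation (including the Hindi danda) with spaces, then split on whitespace."""
    cleaned = re.sub(r"[।,.!?;:()\-]", " ", sentence)
    return cleaned.split()


def context_features(sentence, target, window=3):
    """Return up to `window` non-stop-word tokens on each side of the first `target` occurrence."""
    tokens = tokenize(sentence)
    if target not in tokens:
        return []  # exact surface match only; inflected forms are not handled here
    idx = tokens.index(target)
    left = [t for t in tokens[:idx] if t not in STOP_WORDS][-window:]
    right = [t for t in tokens[idx + 1:] if t not in STOP_WORDS][:window]
    return left + right


# "सोना" can mean "gold" or "to sleep"; the context words hint at the sense.
features = context_features("राम ने सोना खरीदा है।", "सोना")
print(features)
```

The resulting list of context words is the kind of bag-of-words input a Naive Bayes classifier can consume directly.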
I have attached the two research papers on the Hindi WSD task and a sample dataset for the project. You can use the same dataset or a similar one, but it must be in Hindi.
The program should be able to provide the contextual meaning of each ambiguous word. The working project must be similar to these papers.
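For orientation, here is a minimal sketch of a Naive Bayes sense classifier over bag-of-words context features, with add-one (Laplace) smoothing. It is not the implementation from the referenced papers. The class name, the toy training examples, and the sense labels SONA_GOLD and SONA_SLEEP are all hypothetical; real sense IDs would come from Hindi WordNet.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesWSD:
    """Multinomial Naive Bayes over context words, with add-one smoothing."""

    def __init__(self):
        self.sense_counts = Counter()            # number of training instances per sense
        self.word_counts = defaultdict(Counter)  # context-word counts per sense
        self.vocab = set()

    def fit(self, examples):
        """`examples` is an iterable of (context_words, sense_id) pairs."""
        for words, sense in examples:
            self.sense_counts[sense] += 1
            self.word_counts[sense].update(words)
            self.vocab.update(words)
        return self

    def predict(self, words):
        """Return the sense maximizing log P(sense) + sum of log P(word | sense)."""
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, float("-inf")
        for sense in sorted(self.sense_counts):  # sorted for deterministic tie-breaking
            score = math.log(self.sense_counts[sense] / total)
            denom = sum(self.word_counts[sense].values()) + vocab_size
            for word in words:
                score += math.log((self.word_counts[sense][word] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Toy training data for the ambiguous word "सोना" (hypothetical labels, not WordNet IDs).
train = [
    (["राम", "खरीदा"], "SONA_GOLD"),
    (["गहने", "चांदी"], "SONA_GOLD"),
    (["रात", "जल्दी"], "SONA_SLEEP"),
    (["बच्चा", "बिस्तर"], "SONA_SLEEP"),
]
model = NaiveBayesWSD().fit(train)
print(model.predict(["चांदी", "खरीदा"]))
print(model.predict(["रात", "बिस्तर"]))
```

With the sklearn version listed in the requirements, a CountVectorizer followed by MultinomialNB would be a natural substitute for this hand-written class; one such model would typically be trained per ambiguous word.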
The project should target the following versions:
--------------------------------------------------------------------
Python: 3.6.1 |Anaconda 4.4.0 (64-bit)
scipy: 0.19.0
numpy: 1.13.3
matplotlib: 2.0.2
pandas: 0.20.1
sklearn: 0.18.1
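A small check like the one below could be run in the Jupyter environment to compare the installed packages against this list. It is only a sketch: the REQUESTED dictionary simply restates the versions above, and the function name is hypothetical.

```python
import importlib
import sys

# Versions requested in this posting, keyed by import name.
REQUESTED = {
    "scipy": "0.19.0",
    "numpy": "1.13.3",
    "matplotlib": "2.0.2",
    "pandas": "0.20.1",
    "sklearn": "0.18.1",
}


def installed_versions(packages=REQUESTED):
    """Return {import_name: version string, or None if the package cannot be imported}."""
    found = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
            found[name] = getattr(module, "__version__", "unknown")
        except Exception:  # a broken install can raise errors other than ImportError
            found[name] = None
    return found


print("Python:", sys.version.split()[0])
for name, version in installed_versions().items():
    status = "OK" if version == REQUESTED[name] else "differs from requested"
    print("{}: installed={} requested={} ({})".format(name, version, REQUESTED[name], status))
```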
Please check the attachments for more details.
I can do this project.
I am 100% sure of this.
Relevant Skills and Experience
C Programming, Java, Machine Learning, Python, Software Architecture
Proposed Milestones
$72 USD - milestone