JavidChaji/FUM-Information-Retrieval-Indexing-and-Retrieval-Models

Ferdowsi University of Mashhad, Information Retrieval: Indexing and Retrieval Models

Table of Contents
  1. About The Project
  2. Contributing
  3. License
  4. Contact
  5. Acknowledgments

About The Project

Text preprocessing steps:

  1. Normalization
  2. Stemming
  3. Lemmatization
  4. Stop-word removal
  5. Punctuation removal
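
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the project lists hazm among its libraries, which offers Persian-specific tools for these steps, but the sketch below deliberately uses only the Python standard library so it runs anywhere. The stop-word set, the suffix list, and all function names are hypothetical, the step order is one plausible choice, and lemmatization is omitted because it needs a dictionary.

```python
import string
import unicodedata

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in"}

# Hypothetical suffixes for a toy stemmer (a real stemmer is language-specific).
SUFFIXES = ("ing", "ed", "s")


def normalize(text):
    """Apply Unicode NFKC normalization, lowercase, and collapse whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).lower().split())


def remove_punctuation(text):
    """Delete ASCII punctuation (string.punctuation does not cover Persian marks)."""
    return text.translate(str.maketrans("", "", string.punctuation))


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Normalize, strip punctuation, drop stop words, then stem each token."""
    cleaned = remove_punctuation(normalize(text))
    tokens = [t for t in cleaned.split() if t not in STOP_WORDS]
    return [stem(t) for t in tokens]


if __name__ == "__main__":
    print(preprocess("The cats are  running, and the dog jumped!"))
```

Stop words are removed before stemming here so that the stop-word lookup sees unmodified tokens; a different order may suit other languages or stemmers better.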

TF-IDF:

  • Weights each term by how often it occurs in a document, discounted by how many documents in the collection contain it, to estimate how relevant the term is to that document.
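
As a rough illustration of the weighting described above, the sketch below computes one common textbook variant: term frequency (count divided by document length) multiplied by the logarithm of the inverse document frequency. It is an assumption that this variant matches the project's; scikit-learn's `TfidfVectorizer`, for example, applies smoothing and L2 normalization by default and will give different numbers. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. The weight of a term in a document is
    (count / document length) * log(number of documents / document frequency).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    sample = [["cat", "sat"], ["cat", "ran"]]
    for doc_weights in tf_idf(sample):
        print(doc_weights)
```

A term that occurs in every document gets weight zero under this variant, which is the intended effect: it does not help distinguish one document from another.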

Libraries:

  • pandas
  • numpy
  • json
  • ast
  • math
  • scipy
  • threading
  • hazm
  • sklearn
  • google

Built With

Technologies and tools used in this project:

  • Python
  • Jupyter
  • NumPy
  • pandas
  • scikit-learn
  • SciPy

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Javid Chaji - @JavidChaji - javid.chaji@gmail.com

Project Link: https://github.com/JavidChaji/FUM-Information-Retrieval-Indexing-and-Retrieval-Models

Acknowledgments
