This assignment serves as a screening test for our internship program. It is designed to evaluate your technical skills, problem-solving abilities, and creativity in the realm of Natural Language Processing (NLP), specifically in the biomedical domain. Your performance on this assignment will be a key factor in our selection process.
- Assignment version: 1.0
- Last updated: 12 September 2023
- Estimated time: 1-2 hours
- Deadline: 27 September 2023
Welcome to your internship test assignment! 🚀 Your mission, should you choose to accept it, is to apply advanced NLP techniques to biomedical literature abstracts from PubMed. Your findings will provide insights that help drive our research in the biomedical domain. 🧬🔬
"This README will self-destruct in 5 seconds". Just kidding, but do read on for your assignment details. Good luck, Agent! 🕵️‍♀️🕵️‍♂️
- Python 3.x
You can install the required packages using pip. We have included a requirements.txt file for easier setup.
Run the following command to install all required packages (the leading ! is for Jupyter notebook cells; omit it when running in a terminal):
!pip install -r requirements.txt
Fork this repository to your GitHub account to start your own project. Keep your repository private. Complete the assignment tasks listed below.
Upon completion, share your GitHub repository by emailing its URL to [email protected], with [email protected] in CC.
For this task, you'll use the MEDLINE API to extract literature abstracts on a specific topic of your choice within the biomedical domain.
Example:
Suppose you are interested in "Cancer Immunotherapy". Your task would be to extract relevant abstracts that deal with this topic from PubMed. Explain why you've chosen this particular topic.
For this task, please refer to the data_extraction_starter.ipynb template Jupyter notebook provided in this repository.
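As a rough illustration of what the extraction step could look like, the sketch below queries PubMed through the public NCBI E-utilities endpoints (esearch to find article IDs, efetch to download the records) using only the Python standard library. It assumes that E-utilities is the interface meant by "MEDLINE API"; the starter notebook may take a different approach. The function names (build_esearch_url, build_efetch_url, parse_abstracts, fetch_abstracts) are hypothetical and not part of this repository.

```python
import json
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Public NCBI E-utilities base URL (assumed to be the intended API).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, retmax=20):
    """Build an esearch URL that returns matching PubMed IDs as JSON."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    )
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


def build_efetch_url(pmids):
    """Build an efetch URL that returns full records for the given IDs as XML."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"}
    )
    return f"{EUTILS_BASE}/efetch.fcgi?{query}"


def parse_abstracts(xml_text):
    """Extract PMID, title, and abstract text from efetch XML."""
    root = ET.fromstring(xml_text)
    records = []
    for article in root.iter("PubmedArticle"):
        # Structured abstracts have several AbstractText sections; join them.
        parts = [
            "".join(node.itertext()).strip()
            for node in article.iter("AbstractText")
        ]
        records.append(
            {
                "pmid": article.findtext(".//PMID", default=""),
                "title": article.findtext(".//ArticleTitle", default=""),
                "abstract": " ".join(p for p in parts if p),
            }
        )
    return records


def fetch_abstracts(term, retmax=20):
    """Search PubMed for a topic and return the parsed abstracts (needs network)."""
    with urllib.request.urlopen(build_esearch_url(term, retmax)) as resp:
        pmids = json.load(resp)["esearchresult"]["idlist"]
    if not pmids:
        return []
    with urllib.request.urlopen(build_efetch_url(pmids)) as resp:
        return parse_abstracts(resp.read().decode("utf-8"))
```

Calling fetch_abstracts("Cancer Immunotherapy", retmax=5) would then return up to five records for the example topic. Keeping the XML parsing separate from the network calls makes the parser easy to test offline against a saved response.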
For this part, apply advanced NLP techniques to derive valuable insights from the abstracts you've gathered. This is your chance to showcase your NLP skills.
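To make the starting point concrete, here is a minimal baseline sketch: TF-IDF keyword ranking over a list of abstracts, written with the standard library only. It is not an "advanced" technique in the sense this task asks for; it is the kind of simple reference point that more sophisticated methods can be compared against. The stopword list, tokenizer, and function names are illustrative assumptions.

```python
import math
import re
from collections import Counter

# A deliberately tiny stopword list; a real analysis would use a fuller one.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "by", "for", "from", "in", "is",
    "of", "on", "or", "that", "the", "to", "was", "were", "with",
}


def tokenize(text):
    """Lowercase the text and keep word-like tokens that are not stopwords."""
    tokens = re.findall(r"[a-z][a-z0-9\-]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf_keywords(abstracts, top_k=5):
    """Return the top_k highest-scoring TF-IDF terms for each abstract."""
    docs = [Counter(tokenize(a)) for a in abstracts]
    n_docs = len(docs)
    # Document frequency: in how many abstracts each term appears.
    df = Counter(term for doc in docs for term in doc)
    results = []
    for doc in docs:
        total = sum(doc.values()) or 1
        scores = {
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)  # smoothed IDF
            for term, count in doc.items()
        }
        # Sort by score (descending), breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results
```

Terms that appear in every abstract (for instance the topic name itself) score low, so the ranking surfaces what distinguishes each abstract from the rest of the collection.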
Analyze the insights you've derived. Explain why the information is useful and how these insights can assist our research or influence decision-making.
Describe the NLP techniques you have employed and the reasoning behind your choices. Your decision-making process is critical for us.
What metrics have you considered to evaluate the methods you have implemented? Provide an in-depth discussion of your results based on these metrics.
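Which metrics fit depends on the method you chose. As one hedged example, if your method extracts a set of items (say, entity mentions) and you have a small hand-labelled reference set, precision, recall, and F1 are a common choice. The sketch below assumes exact-match comparison between predicted and reference items; the function name and the sample data are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Compute set-based precision, recall, and F1 with exact matching."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data only: items a method extracted vs. a hand-labelled set.
predicted_terms = ["pd-1", "ctla-4", "aspirin"]
gold_terms = ["pd-1", "ctla-4", "car-t", "il-2"]
precision, recall, f1 = precision_recall_f1(predicted_terms, gold_terms)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Unsupervised methods such as topic modelling or clustering need different measures (for example coherence or silhouette scores), so treat this only as a template for reporting whichever metrics you select.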
Update this README file to include your methodology, your findings, and instructions on how to run your code.
Your assignment will be evaluated based on the following criteria:
Criteria | Description | Weightage (%) | Possible Scores |
---|---|---|---|
Code Quality and Organization | Is the code well-organized, documented, and clean? Are proper naming conventions followed? | 20 | 0-5 |
Data Extraction Accuracy | How well did you manage to extract relevant data? Is your chosen topic explained and justified? | 10 | 0-5 |
Innovation in NLP Techniques | Are the NLP techniques used advanced, original, and effective? Did you go beyond basic techniques? | 25 | 0-5 |
Interpretation and Usefulness of Insights | Are the insights derived valuable and clearly explained? Is there a strong justification for their utility in real-world applications? | 25 | 0-5 |
Methodology and Reasoning | Is there a well-documented rationale behind the choice of methods, NLP techniques, and evaluation metrics? | 10 | 0-5 |
Evaluation Metrics and Results | Are the evaluation metrics appropriate and well-defined? Are the results thoroughly analyzed and interpreted? | 5 | 0-5 |
Documentation and README | Is the README comprehensive, including instructions, methodology, and findings? | 5 | 0-5 |
Scoring guide for each criterion:
- 0: Did not meet expectations
- 1: Met some expectations but significant improvements are needed
- 2: Met expectations but some elements could be improved
- 3: Strongly met expectations, minor improvements may be needed
- 4: Exceeded expectations in some areas
- 5: Far exceeded all expectations, exceptional work
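The README does not state how the per-criterion scores are combined. One plausible reading, shown here purely as an assumption, is a linear weighted sum in which each 0-5 score is scaled by its weightage, giving a total out of 100. The weights are taken from the table above; the function name is hypothetical.

```python
# Weightage (%) per criterion, copied from the evaluation table.
WEIGHTS = {
    "Code Quality and Organization": 20,
    "Data Extraction Accuracy": 10,
    "Innovation in NLP Techniques": 25,
    "Interpretation and Usefulness of Insights": 25,
    "Methodology and Reasoning": 10,
    "Evaluation Metrics and Results": 5,
    "Documentation and README": 5,
}
MAX_SCORE = 5


def weighted_total(scores):
    """Combine 0-5 scores into a 0-100 total (assumed linear weighting)."""
    total = 0.0
    for criterion, weight in WEIGHTS.items():
        score = scores[criterion]
        if not 0 <= score <= MAX_SCORE:
            raise ValueError(f"Score for {criterion!r} must be between 0 and {MAX_SCORE}")
        total += weight * score / MAX_SCORE
    return total
```

Under this reading, a submission scoring 3 on every criterion would total 60 out of 100, and the two 25% criteria (NLP innovation and interpretation of insights) together account for half of the result.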
Aut viam inveniam aut faciam ("I shall either find a way or make one")