What students build
Every student works toward a polished research artifact. Depending on the project, this may include a 4–6 page workshop-style paper, reproducible codebase, poster, technical report, preprint, or external submission. We do not guarantee publication, but we help strong projects reach a submission-ready standard.
Hallucination Rates in AI Tutoring
Abstract — We evaluate three language models on 1,200 tutoring prompts, measuring factual consistency and response quality.
3. Experiments
Baselines include GPT-4o-mini, Llama-3-8B, and Mistral-7B…
def run_baseline():
    # load_baseline() is not shown in this excerpt
    model = load_baseline()
    return model.evaluate()
Figure 2 — F1 by model
Best F1: 0.82 — Llama-3-8B outperforms baseline by 12.4%
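For readers unfamiliar with the metric in the sample figure, here is a minimal sketch of how a binary F1 score and a relative improvement over a baseline could be computed. The function names `f1_score` and `relative_gain` and the toy labels are illustrative assumptions; they are not taken from any student project and do not reproduce the numbers shown above.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_gain(score, baseline):
    """Percent improvement of score over baseline (hypothetical helper)."""
    return (score - baseline) / baseline * 100.0


# Toy labels for illustration only: 1 = response flagged as hallucinated.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

model_f1 = f1_score(y_true, y_pred)
print(f"F1: {model_f1:.2f}")
print(f"Gain over a 0.60 baseline: {relative_gain(model_f1, 0.60):.1f}%")
```

In a real project, the labels would come from annotated model outputs, and a student would typically report F1 per model on the same held-out prompts so the comparison is fair.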
Research tracks
Students select a track aligned with their interests and technical background. Each track includes structured mentorship toward a tractable, publishable research question.
LLM Evaluation & AI Safety
Students evaluate language models on reasoning, hallucination, bias, safety, robustness, or domain-specific tasks.
Example projects
- Evaluating LLMs on high school science misconception detection
- Measuring hallucination rates in AI tutoring responses
- Comparing small open-source models on safety classification
AI for Education
Students build and evaluate tools for grading, tutoring, feedback, or learning analytics.
Example projects
- Detecting math reasoning errors in student explanations
- Evaluating LLM feedback quality on college essays
- Building a dataset of student misconceptions
ML for Health & Biology
Students use public datasets to study biomedical prediction, medical imaging, genomics, or molecular ML.
Example projects
- Predicting disease risk from tabular health data
- Classifying medical images with deep learning
- Benchmarking models for molecular property prediction
Computer Vision
Students work on image classification, detection, segmentation, generative models, or multimodal learning.
Example projects
- Detecting urban features from satellite imagery
- Benchmarking vision models on real-world distribution shifts
- Classifying plant disease from image datasets
ML for Economics & Finance
Students apply machine learning to public datasets in economics, finance, housing, labor markets, and information quality.
Example projects
- Predicting housing prices using public economic data
- Detecting financial misinformation with LLMs
- Modeling labor market trends using public datasets
AI for Climate & Social Good
Students use ML to study climate, public policy, environmental risk, and social-impact problems.
Example projects
- Predicting urban heat islands from satellite and census data
- Classifying disaster-related social media posts
- Forecasting air quality with public environmental data
How the program works
A structured 10–12 week research process, from question formulation to submission-ready artifacts.
Project matching and research question
Student is matched with a mentor and chooses a tractable research question.
Literature review and dataset setup
Student reads relevant papers, identifies baselines, and prepares data.
Experiments
Student runs models, evaluates results, and iterates.
Paper and poster
Student writes a workshop-style paper and creates a research poster.
Submission prep
Strong projects are prepared for preprint, workshop, student journal, or showcase submission.
Mentored by graduate researchers
Mentors include PhD students, postdocs, and graduate researchers in machine learning and related fields. Students are matched based on research interests, technical background, and project goals.
We are currently onboarding mentors from leading AI/ML research programs and labs. Mentor matching is based on project fit, not generic tutoring availability.
Mentor profiles coming soon.
Outcomes students can leave with
Research paper or technical report
Reproducible codebase
Research poster
Final presentation/demo
Mentor feedback
Submission-ready manuscript
External submission support
Stronger college/research profile
Disclaimer: Publication and workshop acceptance depend on project quality, venue fit, and reviewer decisions. We help students produce the strongest possible submission, but external outcomes are not guaranteed.
Who should apply
This program is for motivated high school students who are comfortable with Python or willing to learn quickly, curious about machine learning, and ready to commit several hours per week to a serious research project.
We look for students who want to go beyond tutorials and competitions — who are willing to read papers, ask hard questions, iterate on experiments, and take ownership of their work. Prior research or ML experience is helpful but not required; what matters most is motivation, follow-through, and a genuine interest in building something original.
Pricing & admissions
Tuition varies by program length, mentor fit, and project scope. Families receive pricing after an initial consultation.
Frequently asked questions
Publication is not guaranteed. We help students produce submission-ready work and support strong projects through external submissions, but acceptances depend on venue fit, project quality, and reviewer decisions.
Some coding experience is helpful. We match students to projects based on their current level, technical background, and goals.
Mentors are PhD students, postdocs, and graduate researchers in machine learning and related fields.
Students finish with a paper or technical report, a codebase, a poster, and a final presentation. Strong projects may also be submitted externally.
The program typically runs 10–12 weeks.
Students are screened for motivation, technical readiness, and project fit.
Contact us
Questions about the program, admissions, or scheduling a parent/student consultation? Send us a message and we'll follow up.
Apply for the next cohort
Tell us about the student, their background, and research interests. We'll follow up to schedule a consultation.