Introduction
Data science in 2026 is not just about Python and SQL — it is about knowing how to use AI as a force multiplier. Whether you are a senior machine learning engineer or a junior analyst staring down a 50-column dataset, the right AI prompt can turn a 3-hour task into a 15-minute one. The challenge is that most prompts people use are vague and generic, producing equally vague and generic results.
The difference between a weak prompt and a powerful one comes down to three things: context, constraints, and output format. Telling an AI "help me with EDA" is like asking a contractor to "build something nice." But telling it "I have a 120k-row e-commerce dataset with columns for user_id, session_duration, page_views, cart_abandonment (bool), and purchase_value — give me a step-by-step EDA plan in Python using pandas and seaborn, and flag potential data quality issues to check first" — that produces actionable output you can run today.
This guide collects the best AI prompts for data scientists across every major workflow: from raw data exploration and SQL generation to model documentation and executive-facing slide prep. Each prompt is designed to be used as a template — swap in your own dataset details, model names, and business context. By the end, you will have a reusable prompt library that covers 90% of your daily tasks.
1. Exploratory Data Analysis (EDA)
EDA is where every project starts, and it is also where data scientists lose the most time to repetitive boilerplate. AI is extraordinarily good at scaffolding EDA code, identifying what checks to run, and explaining what each output means. The key is to feed it enough detail about your dataset upfront so it does not produce a generic starter notebook.
Use this prompt when you are looking at a new dataset for the first time and want a structured, hypothesis-driven exploration plan rather than a random sequence of .describe() calls.
You are a senior data scientist. I have a dataset with [NUMBER] rows and [NUMBER] columns. The columns are: [LIST COLUMN NAMES AND TYPES]. My target variable is [TARGET COLUMN] and the business question I am trying to answer is: [BUSINESS QUESTION]. Please generate a comprehensive EDA plan in Python (pandas, matplotlib, seaborn) that includes:
1. Data quality checks (nulls, duplicates, outliers, data type mismatches)
2. Univariate analysis for each key feature
3. Bivariate analysis between the most important features and the target
4. Correlation analysis and multicollinearity detection
5. Distribution plots with interpretation notes
6. A summary of 3-5 hypotheses I should test based on the data structure
Use clean, commented code. Flag any common data quality pitfalls specific to this type of dataset.
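The data quality checks in step 1 are a good sanity test for whatever code the AI returns. Here is a minimal plain-Python sketch of null and duplicate detection (in pandas the equivalents are `df.isna().sum()` and `df.duplicated().sum()`); the sample rows and column names are made up for illustration:

```python
from collections import Counter

# Hypothetical sample rows; in practice these come from your dataset
rows = [
    {"user_id": 1, "session_duration": 320.0, "purchase_value": 49.99},
    {"user_id": 2, "session_duration": None, "purchase_value": 15.50},
    {"user_id": 2, "session_duration": None, "purchase_value": 15.50},  # exact duplicate
    {"user_id": 3, "session_duration": 9000.0, "purchase_value": 5.00},  # suspicious outlier
]

# Null counts per column (pandas equivalent: df.isna().sum())
nulls = {col: sum(r[col] is None for r in rows) for col in rows[0]}

# Full-row duplicate count (pandas equivalent: df.duplicated().sum())
counts = Counter(tuple(sorted(r.items())) for r in rows)
duplicates = sum(c - 1 for c in counts.values())

print(nulls)       # {'user_id': 0, 'session_duration': 2, 'purchase_value': 0}
print(duplicates)  # 1
```

Running checks like these yourself before pasting results into the prompt gives the AI concrete problems to address rather than hypothetical ones.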
Tip: paste the output of df.head(5).to_string() and df.dtypes.to_string() directly into the prompt. Giving the AI real column names and actual sample values dramatically improves the specificity of the code it generates.

2. Feature Engineering
Feature engineering is equal parts domain knowledge and pattern recognition — which makes it an ideal use case for AI assistance. A well-structured prompt can surface transformation ideas you might have overlooked, suggest interaction terms worth testing, and flag high-cardinality categorical variables that need special handling.
This prompt works especially well when you have completed initial EDA and are now deciding which raw features to transform before training.
I am building a [CLASSIFICATION / REGRESSION] model to predict [TARGET VARIABLE] in the [INDUSTRY / DOMAIN] domain. My current features are: [LIST RAW FEATURES WITH DESCRIPTIONS]. Act as a senior ML engineer. Suggest a prioritized list of feature engineering steps including:
1. Numeric transformations (log, sqrt, binning, normalization) — explain when and why for each
2. Interaction features that are likely to be predictive given the domain
3. Encoding strategies for high-cardinality categorical variables
4. Time-based feature extraction if applicable (e.g., hour, day-of-week, lag features)
5. Features to consider dropping and why
6. Any domain-specific features common in [INDUSTRY] that I might be missing
For each suggestion, provide a short Python snippet using scikit-learn or pandas. Rate each suggestion High / Medium / Low impact based on common patterns in this domain.
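As a concrete instance of items 1 and 3 above, here is a stdlib sketch of a log transform for a skewed numeric feature and frequency encoding for a high-cardinality categorical. Column names and values are hypothetical; scikit-learn and pandas provide production-grade versions of both:

```python
import math
from collections import Counter

# Hypothetical raw feature values
purchase_value = [5.0, 49.99, 15.5, 1200.0]    # right-skewed numeric
city = ["berlin", "paris", "berlin", "tokyo"]  # high-cardinality categorical

# Item 1: log1p compresses the long right tail and handles zeros safely
log_purchase = [math.log1p(v) for v in purchase_value]

# Item 3: frequency encoding replaces each category with its relative
# frequency, avoiding the column explosion of one-hot encoding
freq = Counter(city)
n = len(city)
city_freq = [freq[c] / n for c in city]

print(city_freq)  # [0.5, 0.25, 0.5, 0.25]
```

Frequency encoding is one of several valid strategies here; the prompt above will typically also surface target encoding and hashing, each with different leakage tradeoffs.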
3. Model Selection
Choosing the right model involves tradeoffs between interpretability, latency, data size, and business requirements — and those tradeoffs look different for every project. AI can help you reason through this decision systematically rather than defaulting to "just try XGBoost on everything."
Use this prompt when you are entering the modeling phase and need a structured comparison of candidate algorithms given your specific constraints.
I need to select the best ML model for the following scenario:
- Task: [CLASSIFICATION / REGRESSION / RANKING / CLUSTERING]
- Dataset size: [NUMBER] rows, [NUMBER] features
- Class balance: [BALANCED / IMBALANCED — ratio if known]
- Key constraints: [e.g., model must be explainable to regulators, inference latency < 50ms, model must run on edge device]
- Current baseline model: [BASELINE MODEL AND ITS METRICS]
- Evaluation metric I care most about: [AUC / F1 / RMSE / etc.]
Compare the following algorithms across these dimensions: predictive performance (expected), training time, inference speed, interpretability, sensitivity to hyperparameters, and suitability given my constraints. Algorithms to compare: Logistic Regression, Random Forest, XGBoost/LightGBM, Neural Network (MLP), and [ANY OTHER]. Recommend a primary model and a fallback. Provide starter scikit-learn / LightGBM code for the top recommendation with a basic hyperparameter search setup.
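Whatever the prompt recommends, hold the candidates to the same evaluation harness. A stdlib sketch of that harness, using two toy regressors (a mean-predictor baseline and 1-nearest-neighbor) on made-up data; in practice you would swap in scikit-learn estimators and cross-validation:

```python
# Candidate "models" are callables that fit on training data
# and return a prediction function.
def mean_model(X, y):
    mu = sum(y) / len(y)
    return lambda x: mu

def nearest_neighbor_model(X, y):
    def predict(x):
        # 1-NN on a single numeric feature
        i = min(range(len(X)), key=lambda j: abs(X[j] - x))
        return y[i]
    return predict

def rmse(preds, actual):
    return (sum((p - a) ** 2 for p, a in zip(preds, actual)) / len(actual)) ** 0.5

# Toy holdout split
X_train, y_train = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
X_test, y_test = [2.5, 3.5], [5.0, 7.0]

for name, fit in [("mean baseline", mean_model), ("1-NN", nearest_neighbor_model)]:
    predict = fit(X_train, y_train)
    score = rmse([predict(x) for x in X_test], y_test)
    print(f"{name}: RMSE={score:.3f}")
```

The point is the structure, not the models: always include the trivial baseline so a fancier recommendation has to earn its complexity.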
4. SQL Query Generation
SQL generation is one of the highest-ROI AI use cases in a data scientist's workflow. Complex multi-table joins, window functions, and aggregation logic that used to take 20 minutes to write from scratch can now be drafted in under 60 seconds — if you give the AI the right schema context.
The secret is to always include DDL or schema definitions. Without them, the AI invents column names and you spend more time correcting errors than you saved.
You are an expert SQL analyst working with [POSTGRES / BIGQUERY / SNOWFLAKE / MYSQL]. Write a query based on the following schema and business requirement.
Schema: [PASTE TABLE DDL OR DESCRIBE TABLE OUTPUT HERE]
Business requirement: [DESCRIBE WHAT YOU NEED IN PLAIN ENGLISH — e.g., "For each user who made at least 2 purchases in the last 90 days, calculate their average order value, the number of unique product categories they bought from, and the time between their first and most recent purchase. Rank them by lifetime value descending."]
Requirements:
- Use CTEs for readability, not nested subqueries
- Add comments explaining each CTE
- Handle NULLs explicitly
- Optimize for [LARGE TABLES / READ PERFORMANCE]
- Output column names should be snake_case and human-readable
Also flag any edge cases or ambiguities in the requirement that could affect results.
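To see what the prompt's CTE requirement buys you, here is a runnable sqlite3 sketch of the kind of query it produces for the example requirement above. Table and column names are hypothetical, and the data is toy-sized:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (user_id INTEGER, order_value REAL, category TEXT);
    INSERT INTO orders VALUES (1, 20.0, 'books'), (1, 30.0, 'games'),
                              (2, 10.0, 'books'), (2, 45.0, 'books');
""")

# The CTE keeps per-user aggregation readable, as the prompt requests
query = """
WITH user_stats AS (
    -- one row per user with at least 2 purchases:
    -- average order value, category breadth, lifetime value
    SELECT user_id,
           AVG(order_value)         AS avg_order_value,
           COUNT(DISTINCT category) AS unique_categories,
           SUM(order_value)         AS lifetime_value
    FROM orders
    GROUP BY user_id
    HAVING COUNT(*) >= 2
)
SELECT user_id, avg_order_value, unique_categories
FROM user_stats
ORDER BY lifetime_value DESC;
"""
print(conn.execute(query).fetchall())  # [(2, 27.5, 1), (1, 25.0, 2)]
```

Because the schema was stated up front, every column reference resolves; without it, the AI would have guessed names like order_total or product_category and the query would fail on first run.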
5. Experiment Design
Designing a rigorous experiment is one of the most underrated skills in data science — and one of the most common sources of misleading results. From insufficient sample size to failure to account for network effects, there are dozens of ways an experiment can look valid but produce garbage conclusions.
Use this prompt when scoping a new test with your product or growth team. It forces all the critical design decisions to the surface before a single line of code is written.
Act as a senior experimentation scientist. I need to design a controlled experiment with the following parameters:
- Product area: [e.g., checkout flow, email onboarding, search ranking]
- Hypothesis: [STATE YOUR HYPOTHESIS — e.g., "Adding a progress bar to the checkout flow will increase purchase completion rate"]
- Primary metric: [METRIC + current baseline value]
- Secondary / guardrail metrics: [LIST METRICS YOU MUST NOT HARM]
- Traffic available per day: [NUMBER OF USERS / SESSIONS]
- Minimum detectable effect (MDE): [% relative or absolute change you care about]
- Risk tolerance: [CONSERVATIVE / STANDARD — affects alpha and power choices]
Please provide:
1. Recommended statistical test (t-test, z-test, Mann-Whitney, etc.) and justification
2. Required sample size calculation (show the formula and result)
3. Recommended experiment duration
4. Randomization strategy (user-level, session-level, geo-level) and which is best here
5. Potential threats to internal validity specific to this experiment
6. Pre-experiment checklist (AA test, SRM check, metric sanity checks)
7. Python code for the power analysis using scipy or statsmodels
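The sample size calculation in step 2 is worth sanity-checking by hand against whatever the AI returns. Here is a stdlib sketch using the simple unpooled approximation for a two-proportion test (statsmodels' power functions give a slightly refined answer); the 10% baseline and 12% target are illustrative:

```python
import math
from statistics import NormalDist

def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sided two-proportion z-test.
    Pooled-variance refinements change the result only slightly."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # power requirement
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Illustrative: 10% baseline conversion, MDE of 2 points absolute (20% relative)
n = sample_size_per_group(0.10, 0.12)
print(n)  # roughly 3.8k users per group at alpha=0.05, power=0.8
```

Dividing that n by your daily traffic per arm gives a first estimate of the experiment duration in step 3.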
6. A/B Test Analysis
Running an experiment is only half the work. Analyzing it correctly — handling imbalanced assignment, novelty effects, heterogeneous treatment effects, and communicating uncertainty to non-technical stakeholders — is where most teams cut corners. This prompt gets you a complete, rigorous analysis from raw experiment data.
I ran an A/B test and have the following results. Please conduct a full statistical analysis.
Experiment summary:
- Test name: [NAME]
- Duration: [START DATE] to [END DATE]
- Control group: [N users], [METRIC VALUE]
- Treatment group: [N users], [METRIC VALUE]
- Primary metric: [METRIC NAME — conversion rate / revenue per user / etc.]
- Type of metric: [PROPORTION / CONTINUOUS]
Please:
1. Run the appropriate statistical test and report the p-value, confidence interval, and effect size (Cohen's d or relative lift)
2. Check for sample ratio mismatch (SRM) and explain what it means if found
3. Calculate the practical significance (not just statistical significance)
4. Segment the results by [USER COHORT / DEVICE TYPE / GEOGRAPHY] if the data supports it
5. Identify any novelty effect risk given the experiment duration
6. Write a 3-sentence plain-English summary suitable for a product manager
7. Give a clear go / no-go / extend recommendation with reasoning
Provide Python code using scipy.stats and pandas for all calculations, with visualizations (confidence interval plots, time-series of the metric over the experiment window).
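Steps 1 and 2 are only a few lines of standard statistics, so they are easy to verify independently of the AI's output. A stdlib sketch with made-up counts (scipy.stats and statsmodels' proportions_ztest are the usual production tools):

```python
from statistics import NormalDist

# Hypothetical experiment results
n_c, conv_c = 10000, 1000  # control: 10.0% conversion
n_t, conv_t = 10050, 1105  # treatment: ~11.0% conversion

# Step 1: two-proportion z-test with pooled standard error
p_c, p_t = conv_c / n_c, conv_t / n_t
p_pool = (conv_c + conv_t) / (n_c + n_t)
se = (p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t)) ** 0.5
z = (p_t - p_c) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 2: sample ratio mismatch, chi-square against the expected 50/50 split;
# chi2 above ~3.84 (p < 0.05 at 1 df) would indicate an assignment problem
expected = (n_c + n_t) / 2
chi2 = (n_c - expected) ** 2 / expected + (n_t - expected) ** 2 / expected

print(f"lift={p_t - p_c:+.4f}, z={z:.2f}, p={p_value:.4f}, srm_chi2={chi2:.3f}")
```

With these numbers the lift is statistically significant and the split passes the SRM check; whether a one-point absolute lift is practically significant (step 3) is a business judgment, not a statistical one.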
7. Debugging ML Pipelines
ML pipeline bugs are uniquely insidious because they often fail silently — producing valid-looking outputs that are subtly wrong. Data leakage, preprocessing applied in the wrong order, train-test contamination, and silent NaN propagation are among the most common culprits. This prompt turns AI into a methodical debugging partner.
I am experiencing an issue with my ML pipeline. Act as a senior ML engineer and help me debug it.
Problem description: [DESCRIBE THE SYMPTOM — e.g., "My validation AUC is 0.94 but production AUC dropped to 0.61 after deployment" OR "My model training loss is NaN after epoch 3" OR "My predictions are all the same value"]
Pipeline overview: [DESCRIBE OR PASTE YOUR PIPELINE CODE / PSEUDOCODE]
What I have already checked: [LIST WHAT YOU HAVE RULED OUT]
Please:
1. List the 5 most likely root causes ranked by probability given the symptom I described
2. For each cause, provide a specific diagnostic check with code
3. Explain the fix for each cause if confirmed
4. Check my pipeline code (if pasted) for common antipatterns: data leakage, incorrect cross-validation, preprocessing order errors, target encoding leakage, or train/test contamination
5. Suggest monitoring checks to catch this class of issue automatically in future runs
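As an example of the diagnostic checks in step 2, here is a stdlib sketch for one of the most common silent failures, train/test contamination, where identical rows appear in both splits and inflate validation metrics. The rows below are hypothetical; in pandas the same check is an inner merge between the two frames:

```python
# Fraction of test rows that also appear verbatim in the training split.
# Anything above 0% deserves investigation before trusting validation metrics.
def contamination_rate(train_rows, test_rows):
    train_set = {tuple(r) for r in train_rows}
    leaked = [r for r in test_rows if tuple(r) in train_set]
    return len(leaked) / len(test_rows)

train = [(1, 0.5, "a"), (2, 0.7, "b"), (3, 0.1, "a")]
test = [(2, 0.7, "b"), (4, 0.9, "c")]  # first row leaked from train

rate = contamination_rate(train, test)
print(f"{rate:.0%} of test rows also appear in train")
```

Exact-match checks like this catch only the crudest contamination; leakage through shared users, shared time windows, or target-derived features needs the entity- and time-aware checks the prompt asks the AI to enumerate.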
8. Model Documentation
Model cards and technical documentation are increasingly required for regulatory compliance, internal model governance, and responsible AI practices — but they are time-consuming to write well. AI can draft 80% of a production-quality model card in minutes if you structure the prompt correctly.
Write a complete model card for the following ML model following the Google Model Cards format and EU AI Act documentation best practices.
Model details:
- Model name: [NAME]
- Model type: [ALGORITHM]
- Version: [VERSION]
- Date: [DATE]
- Use case: [DESCRIBE THE BUSINESS PROBLEM IT SOLVES]
- Input features: [LIST KEY FEATURES]
- Output: [PREDICTION TYPE AND FORMAT]
- Training data: [DESCRIBE DATA SOURCE, DATE RANGE, SIZE]
- Key evaluation metrics: [METRIC: VALUE for train / val / test]
- Known limitations: [LIST ANY YOU ARE AWARE OF]
- Intended users: [WHO WILL USE THIS MODEL]
- Out-of-scope uses: [WHAT THIS MODEL SHOULD NOT BE USED FOR]
Please write sections: Model Description, Intended Use, Evaluation, Training Data, Ethical Considerations, Caveats and Recommendations, and a Short Technical Summary. Format in Markdown. Flag any sections where you need more information to write accurately.
9. Stakeholder Presentation Prep
The ability to communicate complex findings clearly to non-technical audiences is the single biggest career accelerator for data scientists. You can have the most rigorous analysis in the company, but if the VP of Product cannot understand the recommendation in 90 seconds, it will not drive decisions. This prompt helps you translate your technical work into executive-ready narratives.
I need to present the results of a data science project to [AUDIENCE — e.g., VP of Product and CMO / Engineering leadership / Board of Directors]. They are not technical but are data-literate. I have [TIME LIMIT] minutes.
Project summary:
- What I analyzed: [BRIEF DESCRIPTION]
- Key finding: [YOUR MAIN RESULT]
- Business impact: [ESTIMATED REVENUE / COST / TIME IMPACT]
- Recommended action: [WHAT YOU WANT THEM TO DO]
- Main risks or caveats: [KEY UNCERTAINTIES]
Please help me:
1. Write a 3-slide narrative structure (Problem → Finding → Recommendation) using the Pyramid Principle
2. Draft speaker notes for each slide (max 4 sentences per slide)
3. Suggest 2-3 data visualizations that will make the key finding immediately obvious (describe what type of chart and what it should show)
4. Anticipate the 5 most likely challenging questions from this audience and write strong, concise answers for each
5. Rewrite my key finding in one sentence that a CEO would find compelling — focused on business impact, not statistical methods
10. Writing Data Science Reports
Analytical reports sit between technical notebooks and executive presentations — they need enough rigor to stand up to peer review, but enough clarity to be read by a product manager without a statistics degree. This prompt generates a structured report template customized to your specific analysis type, saving you 1-2 hours of writing time per report.
Write a professional data science report on the following analysis. The audience is [TARGET AUDIENCE]. The report should be suitable for internal documentation and potential audit.
Analysis type: [e.g., churn prediction model evaluation / market basket analysis / user segmentation]
Business context: [2-3 sentences on why this was done]
Methodology: [BRIEF DESCRIPTION OF METHODS USED]
Key results: [PASTE YOUR METRICS, TABLES, OR FINDINGS]
Conclusion: [WHAT YOU RECOMMEND]
Structure the report with these sections:
1. Executive Summary (max 150 words — lead with the recommendation)
2. Background and Objectives
3. Data and Methodology
4. Results and Key Findings (use numbered findings)
5. Limitations and Assumptions
6. Recommendations (prioritized and actionable)
7. Next Steps
Writing guidelines:
- Use active voice
- Define every metric on first use
- Avoid jargon — replace technical terms with plain language equivalents in parentheses
- Bold the single most important number in each section
- End each section with a one-sentence "so what" takeaway
Find Your Next Data Science Role with Godle
Godle is the AI-native job search platform built for technical talent. Whether you are a junior analyst looking for your first industry role or a senior ML engineer targeting FAANG-level positions, Godle matches you to roles that actually fit — and helps you land them.
Level Up Your Data Science Career
Godle matches you with data science roles that fit your skills, salary expectations, and career goals — then helps you land them with AI-powered interview prep and resume tailoring.
Find Data Science Jobs — Free