A student I coached recently completed Intuit's interview process for a Data Scientist position. The biggest takeaway from the experience is that Intuit's interviews focus heavily on practical skills and are driven by product-oriented thinking throughout. They do not just test theoretical knowledge; they also value a candidate's ability to solve real business problems. Below is a detailed breakdown of the interview process, the core topics tested, and approaches to answering, for reference during your preparation.
Interview process
The entire process took nearly six weeks. The pace was slow, but each round had a clear screening purpose.
- Recruiter phone screening
- Take-home assignments
- Technical Screen (Karat Platform)
- Virtual Onsite (5 rounds)
SQL
This round covered window functions, CTEs, and cohort analysis of TurboTax/QuickBooks users.
Question 1
Calculate the monthly retention rate for tax-filing users. Given a tax-filing table with the columns user_id, file_date, and product (TurboTax or QuickBooks), calculate the proportion of January users who file again in each subsequent month.
Problem-solving ideas
First, define the cohort: the users who filed for the first time in January. Next, generate a continuous sequence of months starting from January so that every month has an observation point, even a month with no filings. Then LEFT JOIN the cohort users to their later filing activity by user and month to determine whether each user has a filing record in each month. Finally, average the active flags for each month to get the January cohort's retention rate in each subsequent month.
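The steps above can be sketched as a query run through Python's built-in sqlite3 module. This is a minimal sketch under assumptions that are not in the original question: the table is named filings, file_date is stored as ISO text, the cohort year is 2024, and the observation window runs through April. SQLite's strftime and date functions stand in for whatever date functions the interview dialect provides.

```python
import sqlite3

# Assumed names: table `filings`, cohort year 2024, window January to April.
RETENTION_SQL = """
WITH RECURSIVE
cohort AS (                 -- users whose first filing falls in January
    SELECT user_id
    FROM filings
    GROUP BY user_id
    HAVING strftime('%Y-%m', MIN(file_date)) = '2024-01'
),
months(month_start) AS (    -- continuous month sequence, one row per month
    SELECT '2024-01-01'
    UNION ALL
    SELECT date(month_start, '+1 month')
    FROM months
    WHERE month_start < '2024-04-01'
),
activity AS (               -- one row per user per month with a filing
    SELECT DISTINCT user_id, strftime('%Y-%m', file_date) AS active_month
    FROM filings
)
SELECT strftime('%Y-%m', m.month_start) AS month,
       AVG(a.user_id IS NOT NULL) AS retention_rate
FROM cohort AS c
CROSS JOIN months AS m
LEFT JOIN activity AS a
       ON a.user_id = c.user_id
      AND a.active_month = strftime('%Y-%m', m.month_start)
GROUP BY m.month_start
ORDER BY m.month_start;
"""

# Hypothetical sample data: users 1-3 first file in January, user 4 in February.
SAMPLE_FILINGS = [
    (1, "2024-01-15", "TurboTax"),
    (1, "2024-02-10", "TurboTax"),
    (1, "2024-02-20", "QuickBooks"),
    (1, "2024-03-05", "TurboTax"),
    (2, "2024-01-20", "TurboTax"),
    (2, "2024-03-18", "TurboTax"),
    (3, "2024-01-31", "QuickBooks"),
    (4, "2024-02-01", "TurboTax"),
    (4, "2024-03-01", "TurboTax"),
]


def january_retention(rows):
    """Load rows into an in-memory table and return (month, retention_rate) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE filings (user_id INTEGER, file_date TEXT, product TEXT)"
    )
    conn.executemany("INSERT INTO filings VALUES (?, ?, ?)", rows)
    result = conn.execute(RETENTION_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for month, rate in january_retention(SAMPLE_FILINGS):
        print(month, round(rate, 3))
```

The CROSS JOIN between the cohort and the month sequence is what guarantees a row for every user in every month, so months with no filings show up as a retention of zero instead of disappearing from the result.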
Question 2
Sessionization. Table structure: user_id, timestamp, page_view. Definition: consecutive activity with gaps of no more than 30 minutes counts as one session. Calculate the number of sessions for each user.
Problem-solving ideas
First, order each user's activity records by time. Then apply LAG to the timestamp to get the time difference from the previous record, and flag a row as the start of a new session when that difference is greater than 30 minutes (a user's first record has no previous row, so it also starts a session). Next, take a running sum of the new-session flags within each user to assign a session_id to every record. Finally, count the distinct session_ids per user to get the number of sessions for each user.
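The LAG-and-running-sum approach described above can be sketched with Python's built-in sqlite3 module. Assumptions not in the original: the table is named events, the timestamp column is named ts and stored as ISO text, and the bundled SQLite is version 3.25 or later (needed for window functions).

```python
import sqlite3

# Assumed names: table `events`, timestamp column `ts` stored as ISO text.
SESSION_SQL = """
WITH gaps AS (       -- seconds since the same user's previous event
    SELECT user_id, ts,
           CAST(strftime('%s', ts) AS INTEGER)
           - CAST(strftime('%s', LAG(ts) OVER (
                 PARTITION BY user_id ORDER BY ts)) AS INTEGER) AS gap_seconds
    FROM events
),
flagged AS (         -- first event (NULL gap) or gap > 30 min starts a session
    SELECT user_id, ts,
           CASE WHEN gap_seconds IS NULL OR gap_seconds > 1800
                THEN 1 ELSE 0 END AS is_new_session
    FROM gaps
),
sessions AS (        -- running sum of flags gives a per-user session_id
    SELECT user_id, ts,
           SUM(is_new_session) OVER (
               PARTITION BY user_id ORDER BY ts) AS session_id
    FROM flagged
)
SELECT user_id, COUNT(DISTINCT session_id) AS session_count
FROM sessions
GROUP BY user_id
ORDER BY user_id;
"""

# Hypothetical sample data.
SAMPLE_EVENTS = [
    (1, "2024-03-01 09:00:00", "home"),
    (1, "2024-03-01 09:10:00", "income"),
    (1, "2024-03-01 09:50:00", "review"),    # 40-minute gap: new session
    (1, "2024-03-01 10:05:00", "checkout"),
    (2, "2024-03-01 12:00:00", "home"),
    (2, "2024-03-01 12:30:00", "income"),    # exactly 30 minutes: same session
    (2, "2024-03-01 13:30:00", "review"),    # 60-minute gap: new session
    (3, "2024-03-01 08:00:00", "home"),
]


def count_sessions(rows):
    """Return (user_id, session_count) pairs for the given event rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, ts TEXT, page_view TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    result = conn.execute(SESSION_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(count_sessions(SAMPLE_EVENTS))
```

Whether a gap of exactly 30 minutes starts a new session is a boundary choice worth confirming with the interviewer; this sketch follows the wording "greater than 30 minutes".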
Question 3
Cumulative revenue by customer. Transaction table: customer_id, date, amount. Show the cumulative revenue of each customer over time.
Problem-solving ideas
This type of problem needs no extra grouping or self-join; ordering by time within each customer is enough. Partition by customer_id, order by transaction date, and use a window function to sum the transaction amounts up to and including the current row. The window starts at the customer's first transaction (UNBOUNDED PRECEDING) and extends to the current row, so each row shows that customer's cumulative revenue up to that point in time.
Reference SQL
SELECT
    customer_id,
    date,
    amount,
    SUM(amount) OVER (
        PARTITION BY customer_id
        ORDER BY date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS cumulative_revenue
FROM transactions
ORDER BY customer_id, date;
Python
This round focuses on Pandas rather than LeetCode-style algorithms.
Core topics:
- Merging user data from multiple sources (with inconsistent IDs): use pd.merge(how="outer") and handle missing values
- Time series resampling: set a datetime index, then call resample("D").mean()
- GroupBy with custom aggregation: the top N users with the highest monthly spending
- Vectorized feature engineering: binning continuous variables
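The four topics above can be sketched on toy data as follows. This is a minimal sketch that assumes pandas is installed; every DataFrame, column name, ID format, and bin edge here is hypothetical and chosen only for illustration. The ID normalization step is one plausible way to reconcile inconsistent IDs, not necessarily the one the interviewer expects.

```python
import pandas as pd

# Hypothetical sources that format the same user ID differently.
crm = pd.DataFrame(
    {"user_id": ["U001", "U002", "U003"], "plan": ["basic", "premium", "basic"]}
)
billing = pd.DataFrame(
    {"user_id": [" u001", "u003 ", "u004"], "balance": [120.0, 80.0, 45.0]}
)


def normalize_id(ids: pd.Series) -> pd.Series:
    """Strip whitespace and upper-case so both sources share one key format."""
    return ids.str.strip().str.upper()


# 1) Outer merge on the normalized key, then fill the gaps it exposes.
crm["user_id"] = normalize_id(crm["user_id"])
billing["user_id"] = normalize_id(billing["user_id"])
merged = (
    pd.merge(crm, billing, on="user_id", how="outer", indicator=True)
    .sort_values("user_id")
    .reset_index(drop=True)
)
merged["plan"] = merged["plan"].fillna("unknown")
merged["balance"] = merged["balance"].fillna(0.0)

# Hypothetical transaction log used by the remaining steps.
spend = pd.DataFrame(
    {
        "user_id": ["U001", "U002", "U001", "U003", "U001", "U002"],
        "ts": pd.to_datetime(
            [
                "2024-01-01 09:00",
                "2024-01-01 15:00",
                "2024-01-03 10:00",
                "2024-02-02 11:00",
                "2024-02-10 12:00",
                "2024-02-15 13:00",
            ]
        ),
        "amount": [10.0, 30.0, 50.0, 70.0, 20.0, 90.0],
    }
)

# 2) Resampling: datetime index, then the daily mean (empty days become NaN).
daily_mean = spend.set_index("ts")["amount"].resample("D").mean()

# 3) GroupBy + top N: total spend per user per month, keep the top 2 per month.
monthly = (
    spend.assign(month=spend["ts"].dt.strftime("%Y-%m"))
    .groupby(["month", "user_id"], as_index=False)["amount"]
    .sum()
)
top2 = (
    monthly.sort_values(["month", "amount"], ascending=[True, False])
    .groupby("month")
    .head(2)
)

# 4) Vectorized binning of a continuous variable into labeled bands.
monthly["spend_band"] = pd.cut(
    monthly["amount"],
    bins=[0, 25, 75, float("inf")],
    labels=["low", "mid", "high"],
)

if __name__ == "__main__":
    print(merged, daily_mean, top2, monthly, sep="\n\n")
```

The indicator=True argument adds a _merge column showing which source each row came from, which makes it easy to explain to the interviewer how many IDs failed to match.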
Statistics & Experimental Design Questions
Intuit takes A/B testing very seriously as they have been running quite a few statistical experiments in TurboTax.
Question 1: Design an A/B test for a new checkout feature. Define the success metrics and the sample size, and explain how to prevent peeking at the results early.
Reference answer:
First, define the success metric. The primary metric is usually the checkout conversion rate; lock it in before the experiment starts to avoid the multiple-testing problem that comes from trying different metrics during the experiment. Estimate the sample size from the historical conversion rate and the minimum effect size you want to detect, so that the experiment has sufficient statistical power. To prevent early peeking, set the experiment duration and stopping rules in advance, and analyze the results only once, after reaching the preset sample size or the end of the period. If interim looks are necessary, use sequential testing or a corresponding correction method to control the overall error rate.
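The sample size estimate described above can be sketched with the standard normal-approximation formula for comparing two proportions. This is an illustration under assumptions, not Intuit's method: the 10% baseline conversion rate, the 1 percentage point minimum detectable effect, the 0.05 significance level, and the 80% power are all hypothetical inputs.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per arm for a two-sided two-proportion z-test.

    baseline: historical conversion rate of the control (e.g. 0.10)
    mde_abs:  minimum detectable effect as an absolute lift (e.g. 0.01)
    """
    p1 = baseline
    p2 = baseline + mde_abs
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / mde_abs ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: 10% baseline, detect an absolute lift of 1 point.
    print(sample_size_per_arm(0.10, 0.01))
```

Halving the minimum detectable effect roughly quadruples the required sample, which is a useful point to raise when the interviewer asks how long the test must run.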
Question 2: How to interpret p-values and confidence intervals? If p = 0.06, how do you conclude?
Reference answer:
A p-value is the probability of seeing a result at least as extreme as the observed one if the null hypothesis were true, and a 95% confidence interval comes from a procedure that covers the true effect in 95% of repeated experiments. If p = 0.06, the result is not statistically significant at the commonly used 0.05 significance level, so the null hypothesis cannot be rejected. But I would not draw a conclusion from that alone. I would look at the confidence interval to see whether the direction and magnitude of the effect are meaningful for the business, and check whether the experiment had enough statistical power. If the confidence interval still covers effects of practical value and the power is low, the likelier explanation is that the sample was too small to reach significance. Conversely, if the effect itself is small, even a larger sample may not yield business value.
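To make the reasoning above concrete, here is a small sketch of a two-proportion z-test that returns both the p-value and a confidence interval for the lift. The conversion counts are hypothetical and were picked so that the p-value lands near 0.06; the pooled standard error is used for the test and the unpooled one for the interval, which is a common convention and not something the original answer specifies.

```python
import math
from statistics import NormalDist


def two_proportion_test(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Return (p_value, (ci_low, ci_high)) for the lift of B over A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a

    # Two-sided z-test using the pooled rate under the null hypothesis.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Confidence interval for the difference using the unpooled standard error.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_value, (diff - z_crit * se, diff + z_crit * se)


if __name__ == "__main__":
    # Hypothetical counts: 1000/10000 in control, 1081/10000 in treatment.
    p_value, (low, high) = two_proportion_test(1000, 10_000, 1081, 10_000)
    print(f"p = {p_value:.3f}, 95% CI for lift = ({low:.4f}, {high:.4f})")
```

With these counts the interval includes zero but also extends past a one-point lift, which is the situation where the answer above points to insufficient sample size as the likelier explanation.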
Follow-up questions
- How to deal with the novelty effect
- How to balance the effects of seasonality
- How to deal with SRM (sample ratio mismatch)
- The difference between t-test and z-test
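Of the follow-ups above, the SRM check is the easiest to show in code. A minimal sketch, assuming a two-arm test with an intended 50/50 split: compare the observed assignment counts with the expected ones using a chi-square goodness-of-fit test with one degree of freedom. The counts and the 0.001 alert threshold are hypothetical choices for illustration.

```python
import math


def srm_p_value(n_control, n_treatment, expected_control_share=0.5):
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for a 2-arm split."""
    total = n_control + n_treatment
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # For 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def has_srm(n_control, n_treatment, threshold=0.001):
    """Flag a mismatch when the split is very unlikely under the intended ratio."""
    return srm_p_value(n_control, n_treatment) < threshold


if __name__ == "__main__":
    # Hypothetical assignment counts.
    print(has_srm(50_000, 50_100))  # small imbalance
    print(has_srm(50_000, 51_200))  # larger imbalance
```

When the check fires, the usual response is to stop interpreting the metric results and look for a bug in assignment, logging, or filtering before anything else.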
Product Thinking & Case Analysis Questions
TurboTax's conversion rate drops 5%. How do you diagnose it?
Problem-solving ideas:
Start from the overall funnel: break the TurboTax filing flow into its key steps and locate the step where most of the conversion drop occurs. Then segment the users, for example new versus returning users, filing complexity, device, or acquisition channel, to see whether the drop is concentrated in specific groups or scenarios. On that basis, form hypotheses from recent changes to the UI, flow, or pricing. Finally, use experiments or comparative validation to confirm which factors are actually causing the drop.
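The first step of that diagnosis, locating the funnel step where conversion fell, can be sketched in a few lines. The step names and user counts below are hypothetical; a real analysis would repeat the same comparison within each user segment.

```python
def step_conversion(counts):
    """Map each 'step -> next step' transition to its conversion rate.

    counts: dict of funnel step name to user count, in funnel order.
    """
    steps = list(counts)
    return {
        f"{current} -> {following}": counts[following] / counts[current]
        for current, following in zip(steps, steps[1:])
    }


def largest_drop(before, after):
    """Return the transition whose conversion rate fell the most, and by how much."""
    conv_before = step_conversion(before)
    conv_after = step_conversion(after)
    deltas = {name: conv_after[name] - conv_before[name] for name in conv_before}
    worst = min(deltas, key=deltas.get)
    return worst, deltas[worst]


# Hypothetical funnel counts for two comparable periods.
BEFORE = {"start": 10_000, "income": 8_000, "review": 6_000,
          "checkout": 4_500, "filed": 4_050}
AFTER = {"start": 10_000, "income": 8_000, "review": 6_000,
         "checkout": 4_200, "filed": 3_780}

if __name__ == "__main__":
    transition, delta = largest_drop(BEFORE, AFTER)
    print(f"{transition}: {delta:+.3f}")
```

Comparing step-to-step rates, not raw counts, keeps an upstream traffic change from being mistaken for a downstream conversion problem.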
Behavioral interview
- Projects you’ve driven with the greatest business impact
- How do you take responsibility when something goes wrong?
- How do you recover from setbacks?
- How do you influence others without formal authority?
I recommend preparing 5-7 stories in advance that cover multiple themes, such as collaboration and resilience. It is best to practice with a recording so that you do not sound like you are reciting a script.
For those preparing for DS interviews
If you are preparing for a Data Scientist interview that focuses on products and business decisions, the difficulty is often not answering the questions themselves but having someone help you align your analysis with the interviewer's expectations. I have long offered one-on-one real-time interview assistance, and some students I have worked with have advanced to related positions at Intuit, Meta, Amazon, and other companies. If you keep getting stuck at these stages, you are welcome to get in touch to see whether targeted preparation together would be a good fit.