Two Sigma has been sending out its OA (online assessment) in waves recently, and I finished one set today. My overall impression fits in one sentence: the questions are neither obscure nor tricky. The difficulty is basic to medium, but the bar for implementation detail and mathematical understanding is fairly high. My set took about 20 minutes to complete, and I never got seriously stuck. As long as the approach is clear, the pace stays smooth. For students who have not taken it yet or are still waiting for the invitation, here is a brief breakdown of the question types and core ideas for reference.
Two Sigma OA: real questions and solution ideas
Question 1: Linear interpolator
At its core, this question asks you to implement a linear interpolation function under a given set of rules, but it has many more details than ordinary interpolation.
The solution breaks down into three steps:
The first step is to preprocess the data points.
All points must first be sorted by their x values, and duplicate x values need special handling. The rules are:
- When the query x ≤ the duplicated x, use the minimum y recorded at that x
- When the query x > the duplicated x, use the maximum y recorded at that x
If you miss this rule, you will get a wrong answer (WA) on the test cases with duplicated points.
The second step is to locate the interval.
For a given query x, determine which two known points it falls between:
- If it lies inside the range of known points, do ordinary interpolation
- If it lies to the left of the smallest x or to the right of the largest x, extrapolate along the straight line through the two nearest points
The third step is to calculate the y value.
Apply the two-point line formula in every case. Extrapolation and interpolation share the same logic; only the pair of points used differs.
In summary, it is not difficult, but it tests how carefully you read the rule description and handle boundary conditions, which is very Two Sigma.
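The three steps above can be sketched roughly as follows. This is a minimal sketch, not the official solution: the function names, the (x, y) tuple input format, and the way the duplicate rule is applied to each endpoint during extrapolation are my own assumptions based on the rules described above.

```python
from bisect import bisect_left

def build_table(points):
    """Step 1: sort by x and keep the min and max y for every distinct x."""
    lo, hi = {}, {}
    for x, y in points:
        lo[x] = min(y, lo.get(x, y))
        hi[x] = max(y, hi.get(x, y))
    xs = sorted(lo)
    return xs, [lo[x] for x in xs], [hi[x] for x in xs]

def interpolate(points, q):
    """Interpolate (or extrapolate) at query q. Hypothetical helper name."""
    xs, y_min, y_max = build_table(points)
    if len(xs) < 2:
        raise ValueError("need at least two distinct x values")

    # Step 2: locate the interval. i is the first index with xs[i] >= q.
    i = bisect_left(xs, q)
    if i < len(xs) and xs[i] == q:
        return float(y_min[i])          # q <= duplicated x, so use the min y
    if i == 0:                          # left of all points: extrapolate
        left, right = 0, 1
    elif i == len(xs):                  # right of all points: extrapolate
        left, right = len(xs) - 2, len(xs) - 1
    else:                               # strictly inside an interval
        left, right = i - 1, i

    # Duplicate rule (assumed to apply per endpoint): max y if q is to the
    # right of that x, otherwise min y.
    y_left = y_max[left] if q > xs[left] else y_min[left]
    y_right = y_max[right] if q > xs[right] else y_min[right]

    # Step 3: two-point line formula, shared by both cases.
    slope = (y_right - y_left) / (xs[right] - xs[left])
    return y_left + slope * (q - xs[left])
```

With the points (0, 0), (2, 2), (2, 4), (4, 6), a query at x = 1 uses y = 2 at the duplicated x = 2, while a query at x = 3 uses y = 4 there.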
Question 2: Town temperature analysis
This question carries the most information in the whole OA. It has 5 sub-questions in total, all built around a temperature data set covering multiple towns.
Q1: Find the town with the largest temperature variation
The idea is straightforward: compute the standard deviation of the temperatures for each town and return the town with the largest one.
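A minimal sketch of this idea, assuming the data is held as a dictionary mapping each town name to its list of readings (the actual input format of the OA may differ, and `most_variable_town` is a name I made up). It uses the population standard deviation; if every town has the same number of readings, the sample version picks the same town.

```python
from statistics import pstdev

def most_variable_town(temps):
    # temps: assumed layout {town name: [temperature readings]}
    return max(temps, key=lambda town: pstdev(temps[town]))
```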
Q2: Conditional filtering + median
Keep only the records where the Town2 temperature is in the range 90-100, take the median of the corresponding NYC temperatures, and finally round the result as the problem statement requires.
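A rough sketch of the filter-then-median step. It assumes the two series are aligned record by record and that both ends of the range are inclusive, which I did not verify against the original statement. Rounding is left to the caller because the exact rounding rule comes from the problem statement.

```python
from statistics import median

def nyc_median_when_town_in_range(town_temps, nyc_temps, low=90, high=100):
    # Keep the NYC reading of every record whose town reading is in [low, high].
    selected = [n for t, n in zip(town_temps, nyc_temps) if low <= t <= high]
    if not selected:
        raise ValueError("no records fall inside the range")
    return median(selected)
```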
Q3: Linear regression (with intercept)
Fit a linear regression between each town and NYC, in the form
Y = a + b x
Estimate the coefficients by least squares, then compute |a| + |b| for each town and return the maximum value.
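The closed-form least squares solution for one predictor is b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b x̄. Below is a small sketch of it. The dictionary layout, the function names, and the choice of the town as x and NYC as Y are assumptions on my part (they match the direction used in Q4 and Q5).

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx          # raises ZeroDivisionError if x is constant
    a = mean_y - b * mean_x
    return a, b

def max_abs_coefficient_sum(temps, target="NYC"):
    # temps: assumed layout {town name: [readings]}, including the target.
    scores = []
    for town, x in temps.items():
        if town == target:
            continue
        a, b = fit_line(x, temps[target])
        scores.append(abs(a) + abs(b))
    return max(scores)
```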
Q4: Single town with the smallest MSE
This is still a regression problem. Fit a separate model of NYC on each town, compute the MSE of the predictions, and find the town with the smallest error.
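A self-contained sketch of this selection, assuming the same dictionary layout as before and an in-sample MSE (computed on the same data used for fitting). Whether the OA expects in-sample error is an assumption, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def mse_of_fit(x, y):
    """Mean squared error of the fitted line on the data it was fitted to."""
    a, b = fit_line(x, y)
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / len(y)

def best_single_town(temps, target="NYC"):
    # Returns (town, mse) for the town whose line predicts the target best.
    y = temps[target]
    errors = {t: mse_of_fit(x, y) for t, x in temps.items() if t != target}
    town = min(errors, key=errors.get)
    return town, errors[town]
```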
Q5: Pair of towns with the smallest MSE
Use two towns together as independent variables to regress NYC, and find the pair that minimizes the MSE.
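One way to sketch this without any library is to center the variables and solve the resulting 2x2 normal equations with Cramer's rule. The model with an intercept, the dictionary layout, and the function names are all assumptions. In practice a linear algebra library would be the more usual choice.

```python
from itertools import combinations

def fit_two_predictors(x1, x2, y):
    """Least squares for y = a + b1*x1 + b2*x2; returns (a, b1, b2) or None."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(u * u for u in d1)
    s22 = sum(v * v for v in d2)
    s12 = sum(u * v for u, v in zip(d1, d2))
    s1y = sum(u * w for u, w in zip(d1, dy))
    s2y = sum(v * w for v, w in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:        # the two towns are (nearly) collinear
        return None
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return my - b1 * m1 - b2 * m2, b1, b2

def best_town_pair(temps, target="NYC"):
    # Returns ((town1, town2), mse) for the pair with the smallest in-sample MSE.
    y = temps[target]
    towns = sorted(t for t in temps if t != target)
    best = None
    for t1, t2 in combinations(towns, 2):
        coef = fit_two_predictors(temps[t1], temps[t2], y)
        if coef is None:
            continue
        a, b1, b2 = coef
        mse = sum((yi - (a + b1 * u + b2 * v)) ** 2
                  for u, v, yi in zip(temps[t1], temps[t2], y)) / len(y)
        if best is None or mse < best[1]:
            best = ((t1, t2), mse)
    return best
```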
This question is clearly designed to test your overall grasp of statistics, regression modeling, and error evaluation, rather than any single isolated skill.
Question 3: Asset Beta Calculation
This is a very typical Two Sigma / quant OA question.
The first part is the basic beta calculation.
Given two return series x and y, compute the slope β of a linear regression without an intercept. The formula boils down to:
β = Σ(xy) / Σ(x²)
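The formula follows from minimizing Σ(y - βx)² over β: setting the derivative to zero gives Σx(y - βx) = 0. A direct translation, with a hypothetical function name and an explicit check for an all-zero x series (how the OA wants that case handled is not something I can confirm):

```python
def beta_no_intercept(x, y):
    """Slope of the least squares line through the origin, y ≈ beta * x."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("beta is undefined when every x is zero")
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / sxx
```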
The second part is online calculation.
Data arrives in batches, and the jth output must be the β recomputed from all the data in batches 1 through j.
If you traverse the full history again on every round, the same work is repeated and the total cost grows roughly quadratically with the amount of data, so the key is optimization.
The right idea is to maintain running statistics, such as:
- Cumulative Σx²
- Cumulative Σxy
Updating these statistics costs time proportional to the size of the new batch only, after which the new β is a single O(1) division. This is a classic online algorithm idea.
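The running-statistics idea can be sketched as a small class. The class name, method name, and the choice to return None before any nonzero x has been seen are my own assumptions, not part of the original question.

```python
class OnlineBeta:
    """Keeps cumulative Σx² and Σxy so beta never needs the full history."""

    def __init__(self):
        self.sum_xx = 0.0
        self.sum_xy = 0.0

    def update(self, xs, ys):
        # Cost is proportional to the size of the new batch only.
        for x, y in zip(xs, ys):
            self.sum_xx += x * x
            self.sum_xy += x * y
        if self.sum_xx == 0:
            return None                     # beta is undefined so far
        return self.sum_xy / self.sum_xx    # O(1) once the sums are updated
```

Usage: create one `OnlineBeta` object, then call `update` once per batch; each call returns the β over all batches seen so far.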
Overall evaluation & suggestions for future students
The character of this Two Sigma OA is clear:
- It does not go after fancy algorithms
- It cares more about mathematical modeling, statistical understanding, and robust engineering implementation
- It demands a solid grasp of boundary conditions, efficiency, and formulas
If you are familiar with probability and statistics, linear regression, and least squares, you will be very comfortable with this set of questions. If not, trying to pick them up at the last minute will be much harder.