Data scientist Interview Questions in San Francisco, CA, US
data scientist interview questions shared by candidates
Top Interview Questions
Data Scientist at Facebook was asked...
You're about to get on a plane to Seattle. You want to know if you should bring an umbrella. You call 3 random friends of yours who live there and ask each independently if it's raining. Each of your friends has a 2/3 chance of telling you the truth and a 1/3 chance of messing with you by lying. All 3 friends tell you that "Yes" it is raining. What is the probability that it's actually raining in Seattle? 38 AnswersBayesian stats: you should estimate the prior probability that it's raining on any given day in Seattle. If you mention this or ask the interviewer will tell you to use 25%. Then it's straight-forward: P(raining | Yes,Yes,Yes) = Prior(raining) * P(Yes,Yes,Yes | raining) / P(Yes, Yes, Yes) P(Yes,Yes,Yes) = P(raining) * P(Yes,Yes,Yes | raining) + P(not-raining) * P(Yes,Yes,Yes | not-raining) = 0.25*(2/3)^3 + 0.75*(1/3)^3 = 0.25*(8/27) + 0.75*(1/27) P(raining | Yes,Yes,Yes) = 0.25*(8/27) / ( 0.25*8/27 + 0.75*1/27 ) **Bonus points if you notice that you don't need a calculator since all the 27's cancel out and you can multiply top and bottom by 4. P(training | Yes,Yes,Yes) = 8 / ( 8 + 3 ) = 8/11 But honestly, you're going to Seattle, so the answer should always be: "YES, I'm bringing an umbrella!" (yeah yeah, unless your friends mess with you ALL the time ;) I thought about this a little differently from a non-bayes perspective. It's raining if any ONE of the friends is telling the truth, because if they are telling the truth then it is raining. If all of them are lieing, then it isn't raining because they told you that it was raining. So what you want is the probability that any one person is telling the truth. Which is simply 1-Pr(all lie) = 26/27 Anyone let me know if I'm wrong here! Here's another perspective on how to answer a question like this: Bring an umbrella. It's Seattle - if it's not raining right now, it probably will be by the time you get there. Show more responses I flagged Nub data scientist's answer as useful, because it shows an interesting flaw in reasoning. The 3 random variables are not to be treated as intrinsically independent. Only conditioned on the truth (raining/not raining) are they independent. Isn't the answer 2/3. The key thing is that they are ALL saying "Yes". You can't have all 3 says yes and have some people lying and some people telling the truth. It either is raining or it isn't. Not both. They either are all lying or all telling the truth. Since they are all in agreement (all lying or all truthful), they are essentially voting as one person. What is the probability that one person is telling the truth? 2/3 Answer from a frequentist perspective: Suppose there was one person. P(YES|raining) is twice (2/3 / 1/3) as likely as P(LIE|notraining), so the P(raining) is 2/3. If instead n people all say YES, then they are either all telling the truth, or all lying. The outcome that they are all telling the truth is (2/3)^n / (1/3)^n = 2^n as likely as the outcome that they are not. Thus P(ALL YES | raining) = 2^n / (2^n + 1) = 8/9 for n=3 Notice that this corresponds exactly the bayesian answer when prior(raining) = 1/2. TLP and nub data scientists, Your answers include possibilities which are not feasible; we cannot have any combination of 2/3 and 1/3 together... what about (2/3)^3? I agree with TLP and nub scientist. For me, the question is really (1 - the odds that all three of your friends are lying to you) Clearly 1 - 1/3 * 1/3 * 1/3. It's convenient that they all gave the same answer, otherwise it would be more difficult. Let Y denote rain, N denote no rain Actual Answer probability ------------------------------------------ Y=> 8/27 YYY, 1/27 NNN, 12/27 YYN, 6/27 YNN N=> 1/27 YYY, 8/27 NNN, 6/27 YYN, 12/27 YNN So, P(Y|YYY) = (8/8+1) = 8/9 The probability of raining is that they are all telling the truth, therefore, (2/3)^3. 26/27 is incorrect. That is the number of times that at least one friend would tell you the truth (i.e., 1 - probability that would all lie: 1/27). What you have to figure out is the odds it raining | (i.e., given) all 3 friends told you the same thing. Because they all say the same thing, they must all either be lying or they must all be telling the truth. What are the odds that would all lie and all tell the truth? In 1/27 times, they would the all lie and and in 8/27 times they would all tell the truth. So there are 9 ways in which all your friends would tell you the same thing. And in 8 of them (8 out of 9) they would be telling you the truth. Show more responses There is an obvious conceptual reason as to why several answers here (ones that don't use Bayes' formula) are incorrect. The probability in question has to depend on the probability of rain in Seattle. If, for the sake of discussion, it ALWAYS rains in Seattle, i.e. P(rain)=1, then the required prob. is always 1 as well. Likewise if it's a place where it never rains, or if the question asks about the prob. of it raining elephants given the 3 friends said yes, it'd be still 0. I believe this is a std. textbook example of the Bayes' formula, anything short of that I don't think will work out. Please correct me if incorrect. But I would just prefer to condition. either they are all telling the truth and its it raining or they are all lying and it is not raining. P(rain)=P(rain|truth,truth,truth)*P(truth,truth, truth)+P(rain|lie,lie,lie)*P(lie,lie,lie) notice that truth does not mean yes it is raining, it simply corresponds to them telling the truth. Since they said yes, IF they were lying and we knew they were lying then the probability of rain would be zero, thus eliminating the second term. P(rain)=P(rain|3xtruth)*P(3xtruth) and the probability of the truth is (2/3)^3 and the probability of rain if they are telling the truth is 1. I did a little skipping of steps, since truth doesnt equal yes, but i just sort of meshed it toegher towards the end YES=yes,yes,yes T=truth, truth, truth L=lie,lie,lie P(Rain|YES)=P(Rain|YES,T)*P(T)+P(Rain|YES,L)*P(L) P(Rain|YES,L)=0==> whats the probability of rain given we know that they are lying and theyve told us it is raining. P(Rain|YES)=P(Rain|YES,T)*P(T) P(Rain|YES,T)=1==> whats the probability of it raining given that they are telling the truth and have told us its raining then P(T)=(2/3)^3 its obvious. why in the world would i do bayesian methods when its certain I agree with (2/3)^3. Interview Candidate solves this problem using Bayesian stats despite the fact that no enough information is given to do Bayesian probability analysis i.e. he had to pull the probability of it raining in Seattle out of thin air when it was not given in the interview question. With only the information from the interview question, we have to assume that friends are either all lying or all telling the truth. Let truth=T and lie=L P(TTT)=8/27, P(LLL)=1/27, P(TLL)=2/27,P(TTL)=4/27. But we know that they all had the same answer, so we must compare P(TTT) to P(LLL). P(TTT) is 8 times more likely than P(LLL), so we have P(All same answers|TTT)=8/9, P(All same answers|LLL)=1/9. Therefore the solution given ONLY THE INFORMATION GIVEN is P(Rain)=8/9, P(Dry)=1/9. This problem requires the marginal probability of rain to solve, following Interview Candidate's answer. M.B. provides the rationale behind why the bayes approach is necessary: if the pr(rain) = 0, then the pr(rain|y, y, y) = 0. (maybe it is July in Seattle). A few conceptual problems in many answers that I want to point out: 1) There is lots of conflation between Pr(truth) and Pr(Y). Pr(truth) = Pr(Y|R) does not equal Pr(Y). 2) Consider there is only a single friend and they say yes, the logical conclusion from a lot of these answers is that Pr(Rain|Yes) = Pr(Yes|Rain) = 2/3, which is not correct. Bayes' rule is very clear in this simpler case. 3) The friends' answers are conditionally independent assuming no collusion. The combinations of their honesty/lying adds no additional information. The marginal probabilities are not independent, Pr(y,y,y) does not equal pr(y)^3, it equals pr(y,y,y,rain) + pr(y,y,y, no rain), the integration of the joint space over rain. Using conditional independence and bayes rule, this becomes: pr(y|rain)^3*pr(rain) + pr(y|no rain)^3(1-pr(rain)). A more general solution using Pr(rain) = r. Pr(rain|y,y,y) = Pr(y,y,y|rain)*pr(rain)/pr(y,y,y) #Bayes' formula pr(y,y,y|rain) = pr(y|rain)^3 = (2/3)^3 #conditional independence pr(y,y,y) = pr(y|rain)^3*pr(rain) + pr(y|no rain)^3*pr(no rain) #by definition, see point 3 the answer: r*(2/3)^3 / [r*(2/3)^3 + (1 - r)*(1/3)^3] It should be (2/3)^3, I think zen and todo is correct. Most of the answers/comments made all unconditional assumptions except a few reasonings that lead to the 8/9 probability. Note that the question states that "Each of your friends has a 2/3 chance of telling you the truth". This essentially means P(raining, yes) + P (non-raining, no) = 2/3. Any attempts to interpret this as conditional probability P(raining | yes) = 2/3 or P(yes | raining) = 2/3 are making other assumptions. Show more responses 8/27 is not the answer. For the weather to be nice in this case, all 3 of your friend NEED to have lied to you. Therefor the odds are 1/27. What if the answer is 50% since the chance of rain and not rain does not depend on what your friends tell you. In the absence of further information, the only correct answer is the posterior probability of rain p is in the interval (0, 1). In the absence of further information any prior is as good as any other, so by implication the posterior can take any value as well. The interval for p can be restricted to [0, 1] on the assumption that the question to the friends would not be posed if the prior is absolute certainty whether it will rain or not. With the further assumption that the prior probability is measured with limited precision (e.g. rounded to a percentage point), the posterior would be in the interval (0,075, 1). If the alternative assumption is made that information from the friends will be requested only if it had any chance to move the posterior below or above 0.5, the posterior interval for the probability is (0.5, 1). any more precise answer than that requires further information about the prior which is not supplied in the original problem formulation. Also note that even a precise answer about the probability of rain is not sufficient to answer the question whether an umbrella should be brought or not. The probability of each of the friend say "YES" is 2/3 * 2/3 * 2/3 = 8/27. Now the probability that it is actually raining in Seattle depends on that how do I select them to phone. There is only three way to select and phone them. So, the probability that it is actually raining in Seattle is 3 * (8/27) = 8/9. Rule of conditional probability states P(A|B) = P( A & B ) / P(B) Reformulating to this case, P(Rain | 3Y) = P(R & 3Y) / P(3Y) P(R & 3Y) = 2/3 ^3 (if it is raining, then they must all speak the truth) = 8/27 (one could multiply probability of rain here. I assumed as prior) P(3y) = all truth or all lie = 2/3 ^ 3 + 1/3 ^3 = 9/27 hence P(R | 3Y) = 8/9 Let X be the probability it's raining. Obviously we want P(X|all three say yes). Now let Y be the probability at least one of them is lying. If Y = 0 it's easy to solve, if not then not so easy. Now you keep going. Obvious, bayesian is a way to go... Show more responses There is a way to easily confirm the right answer. Just write a computer simulation and run it a few million times, which I did. If the long term chance of rain in Seattle is 25%, the chance it is raining now, given the YYY answers and the 2/3 truth 1/3 lying, is 73% (rounded to whole number), which is the same as 8/11, so the reasoning with the Bayesian math is correct. This can easily be solved without Bayes: There are two cases: Case 1: It is raining and all friends are telling the truth: 0.25*(2/3)^3 = 1/4*8/27 Case1: It is not raining and all friends are lying: 0.75*(1/3)^3 = 3/4*1/27 Probability: P(E) = Case1 / (Case1+Case2) = (1/4*8/27) / (3/4*1/27 + 1/4*8/27) = 2 / (11/4) = 8/11 Closest points One or more comments have been removed. |
Write a SQL query to compute a frequency table of a certain attribute involving two joins. What if you want to GROUP or ORDER BY some attribute? What changes would you need to make? How would you account for NULLs? 26 AnswersIn my case this question was like: 'you have a table Submissions with the submission_id, the body, and the parent_id. Submissions can be posts, or comments to a post. In posts, parent_id is null, and in comments, the parent_id is the post the comment is commenting about. How would you go and make a histogram of number of posts per comment_count?' I think i solved it along the lines of: SELECT comment_counts.n_comments, count distinct(n_comments.submission_id) ( select s1.submission_id, COUNT DISTINCT(s2.parent_id) as n_comments OUTER join submissions on s1.submission_id = s2.parent_id group by submission_id) comment_counts GROUP BY comment_counts.n_comments Can you explain why you would even need the self-join here? Can you not just group by parent_id and do the COUNT() on each group, since the parent_id values correspond to the post values when they're not null? Show more responses If you group by parent_id, you'll be leaving out all posts with zero comments. select number_comments, count(submission_id) as number_posts from ( # more than zero comments select submission_id, count(post_id) as number_comments from ( select submission_id, case when parent_id is null 1 else 0 end as post, case when parent_id is not null parent_id else null end as post_id, body from Submissions )k where post =0 group by submission_id ) k1 group by number_comments union select number_comments, count(submission_id) as number_posts from ( # comments= 0 select submission_id, 0 as number_comments from ( select submission_id, case when parent_id is null 1 else 0 end as post, case when parent_id is not null parent_id else null end as post_id, body from Submissions )k where post =1 group by submission_id ) k1 group by number_comments @ RLeung shouldn't you use left join? You are effectively losing all posts with zero comment. select k.post_id, count(submission_id) -1 from (select submission_id, case when parent_id is null then submission_id else parent_id end as post_id from submissions) t group by post_id select t.post_id, count(t.submission_id) -1 from (select submission_id, case when parent_id is null then submission_id else parent_id end as post_id from submissions) t group by post_id select parent_id as post, count(parent_id) as num_of_comments from submissions group by parent_id union select submission_id as post, 0 as num_of_comments from submissions where parent.id=null select comments_count, count(submission_id) as post_count from ( select submission_id, count( distinct parent_id) as comments_count from Table A group by submission_id )A group by comments_count I think all of the Posts are missing Parent_ID. I am editing the code shared above. This will solve the duplicate problem select parent_id as post, count(parent_id) as num_of_comments from submissions group by parent_id union select submission_id as post, 0 as num_of_comments from submissions where parent.id not in (select submission_id from submissions) Here is the solution. You need a left self join that accounts for posts with zero comments. Select children , count(submission_id) from ( Select a.submission_id, count(b.submission_id) as children from Submissions a Left Join submissions b on On a.submission_id=b.parent_id Where a.parent_id is null Group by a.submission_id ) a Group by children I've tested all these on a mock data set and none of them work! Does anyone have the correct solution? I'm stuck on this one.. Show more responses Posts and comments in the same table looks weird. Here's my attempt (made easy with CASE) to exclude all the posts from the table and grouping/counting comments. SEL parent_id ,COUNT(*) as comment_count ( SEL * ,CASE WHEN perent_id IS NULL THEN 'Post' ELSE 'comment' END as post_or_comment FROM Submissions ) a WHERE post_or_comment = 'comment' I think it is pretty straight forward. All the posts will have null parent_id. Considering the table schema to be something like this: CREATE TABLE submissions ( submission_id INT, body VARCHAR(500), parent_id INT ); SELECT DISTINCT nvl(parent_id::TEXT,'Post with no comments') AS post_id, COUNT(CASE WHEN parent_id IS NOT NULL THEN submission_id ELSE 0 END) AS number_of_comments_or_post FROM submissions GROUP BY 1; This will give results like this: post_id number_of_comments_or_post Post with no comments 8 1 10 7 11 13 8 19 9 25 7 So, the first row will give the number of posts with no comments which is 8 and remaining rows tell the number of comments per post. Is there a flaw in this? Not the shortest answer but I think much clearer than anything posted here. Also gives output table that could actually be fed directly into a histogram which was part of the question. SELECT CASE WHEN num_comments IS NULL THEN 0 ELSE num_comments END AS num_comments, COUNT(parent_post_id) AS cnt_posts FROM ( SELECT submission_id AS parent_post_id, comment_count.num_comments FROM Submissions WHERE parent_id IS NULL LEFT JOIN ( SELECT parent_id, COUNT(parent_id) AS num_comments FROM Submissions WHERE parent_id IS NOT NULL GROUP BY 1 ) comment_count ON submission_id = comment_count.parent_id ) GROUP BY 1 ORDER BY 1 select p.parent_id as posts, count(c.submission_id) as commentcount from submissions c inner join submissions p on c.parent_id = p.submission_id group by p.parent_id; select case when parent_id is not null then parent_id else sub_id end as post_id, sum(case when parent_id is not null then 1 else 0 end) as comment_count from submissions group by case when parent_id is not null then parent_id else sub_id end; Create table: create table submissions ( submission_id int null, body varchar(500) null, parent_id int null ); Insert records: (change your database name) INSERT INTO employees.submissions (submission_id, body, parent_id) VALUES (1, 'POST1', null); INSERT INTO employees.submissions (submission_id, body, parent_id) VALUES (null, 'C1', 1); INSERT INTO employees.submissions (submission_id, body, parent_id) VALUES (2, 'POST2', null); INSERT INTO employees.submissions (submission_id, body, parent_id) VALUES (3, 'POST3', null); INSERT INTO employees.submissions (submission_id, body, parent_id) VALUES (null, 'C2', 3); INSERT INTO employees.submissions (submission_id, body, parent_id) VALUES (null, 'C3', 3); Solution: SELECT a.submission_id AS post_id, a.body, sum(CASE WHEN t.parent_id > 0 THEN 1 ELSE IFNULL(t.parent_id,0) END) AS comment_id FROM submissions AS a LEFT JOIN (SELECT b.parent_id FROM submissions AS b) t ON a.submission_id = t.parent_id WHERE a.submission_id IS NOT NULL GROUP BY post_id; Results: 1 POST1 1 2 POST2 0 3 POST3 2 CREATE TABLE users( sid INT , pid INT , body Varchar(255)); insert into users Values ( 2,null, "cover"), (1,2,"Ami is"),(3,2,"hi"),(4,2,"good pic"),(5,null ,"profil pic"),(6,5,"nice"); (select pid , COUNT(pid) as total from users where pid is not null group by pid) create table subs( sub_id integer, parent_id integer ) insert into subs values(1,null); insert into subs values(2,null); insert into subs values(3,null); insert into subs values(4,null); commit; insert into subs values(5,1); insert into subs values(6,1); insert into subs values(7,1); insert into subs values(8,1); insert into subs values(9,2); insert into subs values(10,2); insert into subs values(11,3); insert into subs values(12,3); insert into subs values(12,4); commit; select * from subs select cc, count(sub_id) from ( select a.sub_id, count(b.sub_id) cc from subs a inner join subs b on(b.parent_id = a.sub_id) group by 1) group by 1 I found it easier to explain when I broke it out into named sub tables to handle the case when there are no comments on a post and you want the end result to be the histogram of the number of comments per post: with parent_comment_ct as ( SELECT parent_id, COUNT(parent_id) AS num_comments FROM submissions WHERE parent_id IS NOT NULL GROUP BY parent_id ), submission_comment_ct as ( SELECT su.submission_id AS parent_post_id, pcc.num_comments AS num_comments FROM submissions su LEFT JOIN parent_comment_ct pcc ON su.submission_id = pcc.parent_id WHERE su.parent_id IS NULL ) SELECT CASE WHEN scc.num_comments IS NULL THEN 0 ELSE scc.num_comments END AS num_comments, COUNT(scc.parent_post_id) AS cnt_posts FROM submission_comment_ct scc GROUP BY 1 ORDER BY 1 SELECT recommended_page FROM (SELECT f.user1_id as users, f.user2_id as freinds, l.page_id as recommended_page FROM friendship f INNER JOIN likes l ON f.user2_id = l.user_id WHERE f.user1_id = 1 UNION ALL SELECT f.user2_id as users,f.user1_id as friends,l.page_id as recommended_page FROM friendship f INNER JOIN likes l ON f.user1_id = l.user_id WHERE f.user2_id = 1) MINUS (SELECT page_id as recommended_page FROM likes WHERE user_id = 1); Show more responses SELECT recommended_page FROM (SELECT f.user1_id as users, f.user2_id as freinds, l.page_id as recommended_page FROM friendship f INNER JOIN likes l ON f.user2_id = l.user_id WHERE f.user1_id = 1 UNION ALL SELECT f.user2_id as users,f.user1_id as friends,l.page_id as recommended_page FROM friendship f INNER JOIN likes l ON f.user1_id = l.user_id WHERE f.user2_id = 1) MINUS (SELECT page_id as recommended_page FROM likes WHERE user_id = 1); select c.subID as SubmissionID, count(c.body)-1 as Counts_Comments from subm c LEFT JOIN subm b ON c.subID = b.pID where b.pID is null AND c.pID is NULL group by c.subID UNION ALL select a.pID as SubmissionID, count(a.body) as Counts_Comments from ( select *, case when pID IS NULL then 'P' Else 'C' END as P_O_C from subm)a where P_O_C = 'C' group by a.pID Order by SubmissionID; select a.user_name,b.user_name,page_liked from services_db.pages_liked a, services_db.user_friends b where 1=1 and a.user_name = b.friend_user and a.page_liked not in ( select page_liked from services_db.pages_liked c where 1=1 and c.user_name = b.user_name ) ; One or more comments have been removed. |
Data Scientist at Facebook was asked...
Given an list A of objects and another list B which is identical to A except that one element is removed, find that removed element. 20 AnswersSelect * from A except Select * from B I think it is a coding in algorithm rather than SQL query. So here is my take: def ret_miss(A, B): k = len(A) if k == 2: if A[1] == B[0]: return A[0] elif A[0] == B[0]: return A[1] n = k/2 print A[n], B[n] if A[n] == B[n]: A= A[n:] B=B[n:] else: A=A[:n+1] B=B[:n+1] print A,B return ret_miss(A,B) This works nicely actually. Show more responses In Python: (just numbers) def rem_elem_num(listA,listB): sumA = 0 sumB = 0 for i in listA: sumA += i for j in listB: sumB += j return sumA-sumB (general) def rem_elem(listA, listB): dictB = {} for j in listB: dictB[j] = None for i in listA: if i not in dictB: return i How about this in python, will this work? x = set(listA)-set(listB) print(x) All these supposed answers are missing the point, and this question isn't even worded correctly. It should be lists of NUMBERS, not "objects". Anyway, the question is asking how you figure out the number that is missing from list B, which is identical to list A except one number is missing. Before getting into the coding, think about it logically - how would you find this? The answer of course is to sum all the numbers in A, sum all the numbers in B, subtract the sum of B from the sum of A, and that gives you the number. # python code def missing_obj(original_lst, new_lst): for x in new_lst: original_lst.remove(x) return original_lst select b.element from b left join a on b.element = a.element where a.element is null two ways to do it using sql: 1. select * from A where not in (select * from B) -- assuming you know what element you're looking for 2. select * from (select * from A UNION select * from B) having count(element) = 1 -- again assuming you know the element In R: removed_element <- A[which(!A %in% B)] removed_element Depends on the kind of elements in the lists. If they're numbers, sum(A) minus sum(B) will give the missing element. If they're characters/strings, just dump the elements of A into a dictionary and check each element in B for existence in A. [i for i in A if i not in [j for j in B]] SQL: select a.list as a, b.list as b from ListA as a full join ListB as b on a.list = b.list where a.list eq '' OR b.list eq "" ; Show more responses missing_letters = [] for letter in A: if letter in B: pass else: missing_letters.append(letter) print (missing_letters) Python: sum(A)-sum(B) In SQL: SELECT A.object FROM A LEFT JOIN B ON A.object = B.object WHERE B.object IS NULL; Careful, there could be a repeated object that's being removed. i.e. A = [3, 4, 5, 6, 5] B = [3, 4, 6, 5] This is how I would do it on Python (works for numbers and strings) def missingval(lA, lB): a = sorted(lA) b = sorted(lB) c = None for i in range(len(b)): if a[i] != b[i]: c = a[i] break if c is None: c = a[-1] print(c, "was removed from list A") A = [1,2,3,4,5,6,7,8] B = [1,2,3,4,5,6,8] [i for i in A if i not in B] find the sum of the two list and subtract. ans = sum(a) - sum(b) where a and b are list of numbers. One or more comments have been removed. |
Data Scientist at Facebook was asked...
Consider a game with 2 players, A and B. Player A has 8 stones, player B has 6. Game proceeds as follows. First, A rolls a fair 6-sided die, and the number on the die determines how many stones A takes over from B. Next, B rolls the same die, and the exact same thing happens in reverse. This concludes the round. Whoever has more stones at the end of the round wins and the game is over. If players end up with equal # of stones at the end of the round, it is a tie and another round ensues. What is the probability that B wins in 1, 2, ..., n rounds? 19 AnswersKey is to define the problem correctly. Let Na be the # player A rolls, and Nb be the number player B rolls. Then at the end of the first round A will have (8 + Na - Nb) and B will have (6 - Na + Nb) stones. So all you need is to compute Pr{6 - Na + Nb > 8 + Na - Nb} for round 1 victory. Subsequent rounds are even easier since all subsequent rounds can only start when the stone count is equal. I got [1/6, 15/36, 15/36, 15/36, ...] For B to Win 6 - Na + Nb > 8 +Na - Nb or, Nb - Na > 1 which can happen if (B,A) is (3,1),(4,1),(4,2),(5,1),(5,2),(5,3),(6,1),(6,2),(6,3),(6,4) Hence prob of B winning is = 10/36 = 5/18 For any round to ensue, the prior round has to end in a draw, or 6 - Na + Nb = 8 +Na - Nb or, Nb - Na = 1, for which prob = 4/36 = 1/9 Once equal, B's prob of winning is if total stones with B > total stones with A or 13/36 possibility of another draw = 1/6 For B to win in nth round, prob = 1/9x1/6^(n-2)x13/36 Show more responses For B to Win 6 - Na + Nb > 8 +Na - Nb or, Nb - Na > 1 which can happen if (B,A) is (3,1),(4,1),(4,2),(5,1),(5,2),(5,3),(6,1),(6,2),(6,3),(6,4) Hence prob of B winning is = 10/36 = 5/18 For any round to ensue, the prior round has to end in a draw, or 6 - Na + Nb = 8 +Na - Nb or, Nb - Na = 1, for which prob = 5/36 Once equal, B's prob of winning is if total stones with B > total stones with A or 13/36 possibility of another draw = 1/6 For B to win in nth round, prob = 5/36x1/6^(n-2)x13/36 I don't get it. Shouldn't prob of B winning given it's tie at 1st round be 15/36? given it's tie at 1st round, at the 2nd round Nb > Na can happen if (B,A) is (2,1), (3,1/2),(4,1/2/3), (5,1/2/3/4),(6,1/2/3/4/5), which totals 15 out of 36. I am getting a bit different result. In first round A player (8 stones) has 6 certain ways to win where the sum of both dices is (5,4 or -4,-5 when B draws). Since there is 36 ways in total, i took that 50% of those can be a tie. So, P of winning for A player in first round is: (6+15)/36=21/36. What am I missing? Mistake, P for A player to win in first round might be: (6+10)/36 based on 6 ways to win for sure, 10 to tie, 10 to lose and 10 to win from the 36-6 ways left. Depending on what A rolls (1-6), and then what B rolls (1-6), you are given 36 different possibilites: and then you can make a web out of that to see in what scenarios B will win in round 1. B has a 10/36 chances to win in round 1: (A rolls 1 and B rolls above 4) + (A rolls 2 and B rolls above 3) + ... A has a 21/36 chance to win in round 2 using the same idea and they will tie with 5/36 chances. If they tie, then the game goes to round 2 with A and B having the same number of chips, and so after that it'll be 1/2 chances. So many answers...Here's my version: For round1, B win only if it gets 3 or more stones than A, which is (A,B) = (1,4) (1,5) (1, 6) (2, 5) (2,6) (3,6) which is 6 cases out of all 36 probabilities. So B has 1/6 chance to win. To draw, B needs to get exactly 2 stones more than A, which is (A, B) = (1,3) (2,4) (3,5) (4,6) or 1/9. Entering the second round, all stones should be equal, so the chance to draw become 1/6, and the chance for either to win is 5/12. So the final answer is (1/6, 1/9*5/12, (1/9)^2*5/12, .....(1/9)^(n-1)*5/12) ) Because at the beginning time, A has 8 and B has 6, so let A:x and B:y, then A:8+x-y and B:6-x+y; so there are 10/36 prob of B wins. And A wins prob is 21/36 and the equal prob for next round is 5/36. So for B wins at round prob is 10/36. And if they are equal and to have another round, the number has changed to 7 and 7. So A:7+x-y and B:7-x+y, so this time B wins has prob 15/36 and A wins has prob 15/36. And the equal to have another round is 6/36=1/6. So overall B wins in 2 rounds has prob 5/36*15/36. And for round 3,4,...etc, since after each equal round, the number will go back to 7 and 7 so the prob will not change. So B wins in round 3,4,...n has prob 5/36*(6/36)^(r-2)*15/36. r means the number of the total rounds. On the first round, B can win if (A,B) rolls: (1,4) , (1,5) , (1,6) , (2,5) , (2,6) , (3,6) - so there are 6 out of 36 possibilities where B wins. P(B1) [read, probability that B wins in round 1] = 1/6 On the first round, B can tie if (A,B) rolls: (1,3) , (2,4) , (3,5) , (4,6) - so there are 4 out of 36 possibilites where B ties. If B ties with A, there is a second round, where B wins with probability 15/36, A wins with probability 15/36 = 5/12, and they tie with probability 6/36. So P(B2) = 1/9 * (5/12) After the second round, the game only continues if both players have equal number of points, and the probability of a tie in each game is 1/6, so P(B3) = (1/9)*(1/6)*(5/12) Generally, P(Bn) = (1/9)*(1/6)^(n-2)*(5/12) B wins only: if A throws 4 and B throws 6. if A throws 3 and B throws 5 or 6. if A throws 2 and B throws 4, 5 or 6. if A throws 1 and B throws 3,4, 5 or 6. Hence, B wins 10/36. Also the chance of tie is 5/36 ( all events there is one possibility of tie except when A throws 6). B has to throw one greater than A. After the first throw there are only 14 (hence 7 each). Now the chances of tie are when A and B throw the same. Which means 6/36. But probability of A and B winning are the same hence that would be 15/36. Show more responses Build a 6x6 table of possible rolls and outcomes. For round 1: 5 result in a tie -> P(tie) = 5/36 21 result in A winning -> P(Awins) = 21/36 10 result in B winning -> P(Bwins) = 10/36 For round 2: (which can only start if they are tied with 7 stones each) 6 result in a tie -> P(tie) = 6/36 15 result in A winning -> P(Awins) = 15/36 15 result in B winning -> P(Bwins) = 15/36 To generalize the formula for B wins in n rounds: P(BinfirstRound) = (5/36) P(BinNrounds; N>1) = (5/36)*(1/6)^(n-2)*15/36 This is a poorly worded question. Probability of the game being over in Round 1 is 10/36. Do they continue playing after round 1 ? However, if the game has to go one, the probability is 15/36 (difference between A and B is equal and greater than 1). The game will keep going only if it is 7-7, the probability of 7-7 after round 1 is 6/36. How can B win in so many rounds ? Should it not be over the moment B wins a round and has more stones ? What am I missing ? Let x be what A rolls and y be what b rolls B wins in first round if: 6+y-x > 8+x-y 2y > 2 + 2x Or y > 1 + x Possible scenarios: (1,3) (1 ,4) (1,5) (1,6) (2,4) (2,5) (2,6) (3, 5) (3,6) (4,6) Probability(B wins in 1st round) = 10/36 The game draws in first round if : y = 1+x (1,2) (2,3) (3,4) (4,5) (5,6) Probability (draws in 1st round) = 5/36 Probabilty (draw in 2nd round ) = 6/36 Now with equal stones, probability of B winning all events where y > x 5+4+3+2+1 = 15 P(b wins in R2) = 15/36 Total probability that B wins the game = 10/36 + (5/36) * (6/36)*n-3 * 15/36 Bi - player B wins the ith round Ti - round i is a Tie = player A got m, player B got n We want to find the probability: P(T1, T2, ..., Ti-1, Bi) # ( , ) means AND = P(Bi | T1, ..., Ti-1)*P(T1)*...*P(Ti-1)= (P()+,,, +P())*(4/6*1/6)*(1/6)*...*(1/6) #(...) = i-2 times # P()+,,, +P() here we sum all the probabilities where player B wins #(4/6*1/6) getting Tie at the first round # (1/6)*...*(1/6) here we multiply i-2 cases were we got tie after a tie One or more comments have been removed. |
Data Scientist at Facebook was asked...
We have two options for serving ads within Newsfeed: 1 - out of every 25 stories, one will be an ad 2 - every story has a 4% chance of being an ad For each option, what is the expected number of ads shown in 100 news stories? If we go with option 2, what is the chance a user will be shown only a single ad in 100 stories? What about no ads at all? 16 AnswersExpected number of ads in option 1 is 4. Expected number of ads in option 2 is 0.04 * 100, which is also 4. For chance of getting one add is ~1.4% Bin(100, 0.04,1) Bin(100, 0.04, 0) Bin(n,p,r) = Binomial distribution for n trials, p probability of being ad and r number of ads. Show more responses Chance of getting exactly one add is ~7% As the formula is (NK) (0,04)^K * (0,96)^(N−K) where the first (NK) is the combination number N over K Most of the answers are wrong for option #2. The expected value for option # is NOT 4 ads! you should rather create a probability distribution. P(ads = 0) = 1.69% P(ads = 1) = 7.03% P (ads = 2) = 14.5% P(ads = 3) = 19.73% P (ads = 4) = 19.94% P(ads = 5) = 15.95% .... Most of the P (ads >= 9) = ~0% and the total should add upto 100% And if you are struggling about why that's the case then you should google "binomial distribution". We have similar questions on MockInterview(dot)co if you want to practice. For "MockInterview dot co": The binomial part is correct but you argue that the expected value for option 2 is not 4 but this is false. In both cases E(x) = np = 100*(4/100) = 4 and E(x) = np=100*(1/25) = 4 again. what is the expected mean and standard deviation to have back to back ads in those two method? 1. Calculate the probability of getting 1 ad and 99 no ads = (.04)^1 * (.96)^99 2. Calculate the number of combinations where order does not matter (meaning you can be served an ad on the 25th story or 1st story -- it does not matter) = 100! / (99!*1!) = 100 multiple 1 * 2 which gives you 7% of being served an ad. For the questions 1: I think both options have the same expected value of 4 For the question 2: Use binomial distribution function. So basically, for one case to happen, you will use this function p(one case) = (0.96)^99*(0.04)^1 In total, there are 100 positions for the ad. 100 * p(one case) = 7.03% Thanks Josito! you are right -- 4 is the asnwer, sorry for the confusion! Can someone explain why the answer for part 2 is not 4? What is the evidence that we should use binomial? For each option, what is the expected mean and variance a user will be shown 2 back-to-back ads? Show more responses 4. My question is does the 4 ads pay for the cost 100 news feeds? for the follow up: p = 0.04 1-p = 0.96 Ho: p=0.04 Ha: p<0.04 z_stat = 0.01-0.04/np.sqrt((0.04*0.96)/100)=-1.5 A whole lot of chance for 1 in 100 stories to happen.. cannot reject the null. One or more comments have been removed. |
Data Scientist at Facebook was asked...
There is a table that tracks every time a user turns a feature on or off, with columns user_id, action ("on" or "off), date, and time. How many users turned the feature on today? How many users have ever turned the feature on? In a table that tracks the status of every user every day, how would you add today's data to it? 12 AnswersVia SQL on whiteboard, sorry I'm not re-writing queries as best I remember in a text box. Select count(distinct user_id) as onusers From actions Where date = date() and action='on'; - Users turned the feature on today Feature is on or off yesterday, does not matter: select count(*) from table where action = ‘on’ and date = getdate() Feature was off yesterday: select a.count(*) from (sel * form table where date = getdate() qualify rank() over partition by user_id order by date desc, time desc) a inner join (sel * from table where date = get date() - 1 qualify rank() over partition by user id order by date desc, time desc ) b on a.user_id = b.user_id where a. action = “on” and b.action = “off” How any users have ever turned this on? select count(distinct(user_id)) from table where action = “on" Show more responses Any explanation on Why you used window function above? Any specific reason -- Assume the historical data is stored in a table user_status with columns: user_id, date, status -- users keep the same status for today select us.user_id, today(), us.status from (select user_id, status from user_status where dt = today()-1) us LEFT outer join (select user_id, action, max(time) from action where date = today() group by user_id, action) ts on us.user_id = ts.user_id where ts.user_id is NULL UNION ALL -- users changed status for today select user_id, today(), action from (select user_id, action, max(time) from action where date = today() group by user_id, action) a UNION ALL -- historical status select * from user_status -- Assume the historical data is stored in a table user_status with columns: user_id, date, status -- users keep the same status for today select us.user_id, today(), us.status from (select user_id, status from user_status where dt = today()-1) us LEFT outer join (select user_id, action, max(time) from action where date = today() group by user_id, action) ts on us.user_id = ts.user_id where ts.user_id is NULL UNION ALL -- users changed status for today select user_id, today(), action from (select user_id, action, max(time) from action where date = today() group by user_id, action) a UNION ALL -- historical status select * from user_status INSERT INTO history SELECT user_id, date, action AS status, a.time FROM a INNER JOIN (SELECT user_id, date, MAX(time) AS time FROM a GROUP BY user_id, date WHERE date = CURRENT_DATE())b ON a.user_id = b.user_id AND b.time = a.time WHERE date = CURRENT_DATE(); I haven't tested them, so take care. I considered that turn on today means that yesterday was off How many users turned the feature on today? select count(today.user_id) from log today join log yesterday on yesterday.date = DATE_ADD(CURDATE(), INTERVAL -1 DAT) and today.action = ‘on’ and yesterday.action = ‘off’ and today.user_id = yesterday.user_id where today.date = CURDATE() or (usually less efficient because of the subquery): select count(user_id) from log today where action = ‘on’ and date = CURDATE() and today.user_id in ( select user_id from log yesterday where action = ‘off’ and date = date_add(CURDATE(), INTERVAL -1 DAY) ) group by user_id How many users have ever turned the feature on? select count(distinct today.user_id) from log today join log yesterday on yesterday.date = DATE_ADD(today.date, INTERVAL -1 DAY) and today.action = ‘on’ and yesterday.action = ‘off’ and today.user_id = yesterday.user_id In a table that tracks the status of every user every day, how would you add today's data to it? INSERT INTO tracking (user_id, date, action) select user_id, date, action from log today where date = CURDATE() having time = max(time) group by user_id, date, action First question SELECT count(DISTINCT user_id) FROM table WHERE (date = yest() AND action = 'off') AND (date = today() AND action = 'ON') GROUP BY 1,2 Correction for ^ SELECT count(DISTINCT user_id) FROM table WHERE (date = yest() AND action = 'off') AND (date = today() AND action = 'ON') Users who have ever turned the feature on. that refers to all cases where the action is on in the table, irrespective of date or future and past action SELECT count(DISTINCT *) From table WHERE action = 'on' One or more comments have been removed. |
Data Scientist at Facebook was asked...
Write a function that takes in two sorted lists and outputs a sorted list that is their union. 10 Answersf(a,b) { return sort(unique(a,b)) } def sortedUnion(list1,list2): list3 = [x for x in list1 if x in list2] return sorted(list(set(list3))) google merge sort Show more responses write 2 helpers: 1) INSERT(A, b) = put element b within A in the sort order 2) DEL(A, a) = delete element a from A Then do this recursion: f(A,B) : if max(A) <= min(B) return [A B] else { B = INSERT(B, max(a)); A = DEL(A, max(a); f(A,B); } something like that. try coding and testing. I haven't. Oops, check/write a termination condition On Python, you could do: from sets import Set def merge_sort(a,b): return sorted( Set(a).union(Set(b)) ) def sorted_union(list1, list2): union=set(list1).union(set(list2)) sorted_union=sorted(list(union)) return sorted_union Second part of merge sort. Don't answer with sort(a), etc. Anyone can do that... def merge(A, B): i=0 j=0 sorted_list = [] while i < len(A) and j < len(B): if A[i] <= B[j]: sorted_list.append(A[i]) i += 1 else: sorted_list.append(B[j]) j += 1 if i < len(A): sorted_list.extend(A[i:]) elif j < len(B): sorted_list.extend(B[j:]) return sorted_list I assumed that we can not use any "sort" function and we want it with linear time. so here it is: def my_sort(list_a, list_b): if len(list_a) ==0: return list_b elif len(list_b) ==0: return list_a else: if list_a[-1] > list_b[-1]: return( my_sort(list_a[0:-1], list_b) + [list_a.pop(-1)]) else: return(my_sort(list_a,list_b[:-1]) + [list_b.pop(-1)]) In SQL SELECT List1 FROM Table1 UNION SELECT List2 FROM Table2 ORDER BY List1, List2; |
Data Scientist at Facebook was asked...
1) Provided a table with user_id and dates they visited platform, find the top 100 users with the longest continuous streak of visiting the platform as of yesterday. 2) Provided a table with page_id, event timestamp and a flag for a state (which is on/off), find the number of pages that are currently on. 9 AnswersI've been researching how to count consecutive values in SQL. How would you go about answering question 1? The answer to first question could look like this (returning more than 100 top users if there are more ranking same at 100th place): WITH user_streaks AS ( SELECT user_id, level as lvl, rank() over (order by level desc) rnk FROM table WHERE connect_by_isleaf = 1 START WITH date = trunc(sysdate) - 1 CONNECT BY PRIOR date = date + 1 AND PRIOR user_id = user_id ) SELECT user_id, lvl, rnk FROM user_streaks WHERE rnk <= 100 Self join on userid and date = date-1 Show more responses For question 2: With Max as (SELECT Page_id, MAX(timestamp) as MaxDate FROM page_status Group by Page_id) SELECT Count(*) From page_status as P INNER JOIN Max as M on P. Page_id=M. Page_id and P.timestamp=M.MaxDate WHERE state=’on’ Answer to question 1: WITH pop AS (SELECT user_id FROM users GROUP BY user_id HAVING MAX(CAST (ActionTime AS DATE))= DATEADD(DAY, -1,(CAST(GETDATE() AS DATE)) ), dates as (SELECT DISTINCT T.User_id, CAST(T.ActionTime AS DATE) AS date FROM TABLE AS T INNER JOIN pop AS P ON T.USER_ID=P.user_id ), groups AS ( SELECT dates.user_id, ROW_NUMBER() OVER (PARTITION BY User_id ORDER BY date) AS RN, DATEADD(DAY, -ROW_NUMBER() OVER (PARTITION BY User_id ORDER BY date),date) AS grp, dates.date FROM dates ) SELECT DISTINCT TOP 100 groups.User_id, COUNT(*) AS consecutiveDates FROM groups GROUP BY groups.User_id, groups.grp ORDER BY consecutiveDates DESC The key in generic questions like this, is to make sure to cover the fundamentals. There's usually a back-and-forth with the interviewer. Might be worth doing a mock interview with one of the Facebook Data Scientist experts on Prepfully? Really helps to get some real-world practice and guidance. prepfully.com/practice-interviews #1 is super tricky - I really like it as an interview question. So for question 1 there's a simple and rather elegant way of doing it where you take the max date when somebody came to the platform, the max date that the same person didn't come to the platform, and the difference between those 2 dates would actually give you the users with the longest streak as long as you filter for those users whose max date coming on the platform is today/ yesterday as needed. Select user_id, datediff(days, last_non_txn, last_txn) as from (Select c.*, case when d.date is not null then 1 else 0 end as txn_flag From (Select user_id, date from (Select distinct user_id from ) a Cross join (Select distinct date from ) b ) c Left join (select distinct user_id, date from ) d On c.user_id = d.user_id and c.date = d.date Order by c.user_id, c.date ) e Group by user_id ) f Where last_txn = ‘2020-04-21’ Order by streak_days desc limit 100 So for question 1 there's a simple and rather elegant way of doing it where you take the max date when somebody came to the platform, the max date that the same person didn't come to the platform, and the difference between those 2 dates would actually give you the users with the longest streak as long as you filter for those users whose max date coming on the platform is today/ yesterday as needed. Select user_id, datediff(days, last_non_txn, last_txn) as streak_days from ( Select user_id, max(case when txn_flag = 0 then date else null end) as last_non_txn, max(case when txn_flag = 1 then date else null end) as last_txn from ( Select c.*, case when d.date is not null then 1 else 0 end as txn_flag From ( Select user_id, date from ( Select distinct user_id from ) a Cross join ( Select distinct date from ) b ) c Left join ( select distinct user_id, date from ) d On c.user_id = d.user_id and c.date = d.date Order by c.user_id, c.date ) e Group by user_id ) f Where last_txn = ‘2020-04-21’ Order by streak_days desc limit 100 ; |
Data Scientist at Facebook was asked...
If 70% of Facebook users on iOS use Instagram, but only 35% of Facebook users on Android use Instagram, how would you investigate the discrepancy? 9 AnswersLook at the marketing and SEO of Instagram on Android vs iOS, look at the user experience of the Android version vs Instagram, look at the relative user quality on Android vs iOS. The percent fool us. There are probably 10 times more android users in the world than iOS users, so I would question this comparison to start with. The number of users probably outranks in android than iOS. Good points above. A few more things to consider: - iOS vs Android demographic differences (age: Are iOS users younger on average and thus more likely to use Instagram?; geography: How do majority of iOS countries (US/Canada/Japan) differ from Android countries?) - Integration of Instagram within the FB app: Is it the same for both OS? - Android has different OEMs. Does this difference hold for all OEMs? Show more responses - Get data on iOS and Android user interns of: - number of users in each age - Geo locationns - Income - Mostly accessed apps - Time spent on phone - Number of users on iOS and Android The age, mostly accessed apps, geolocations, number of users and time spent on phone could tell us a few things: - Is there a significant age discrepancies? - Are users on iOS and Android focus on different apps? - Does iOS users travel more frequently than Android users on average in terms of distances and time? - Even the percentage tells iOS has higher %, what about number of users? - Time spent on phone: if the 65% Android users and 30% iOS users spend much less time on the phone, that just means these groups of people may not be aware of different apps that are available on their phones - How the users get the app - In what ways do most users on iOS and Android get the app? Through recommendations? Direct download from App store? Or the phone they get come with the apps? - If the iOS phones come with the app, or put instagram on top of the list in recommendations, but the Android does not, then that could be the problem. - Test recommending apps on both iOS and Android and see response rate. - Investigate the user feedbacks - Is it possible that the user experience for using Instagram on iOS and Android is so different? - Or does it simply because running on Instagram on Android devices is a pain in the ass? I would start with overall IOS&Android user numbers to see if they have large discrepancy. Then I'll do segmentation within two devices based on demographic, like mentioned above, male/female, age group, countries, income(??maybe no accurate data), time spent on FB. If the demographic and ratios structure are relatively similar for two devices, then it's probably not because one device type is particularly strong in one dimension. Then I'll focus on the integration/user experience on two devices. Also, if above user analysis shows different user structures, which may shed some lights on the discrepancy, I can still check integration/user experience. a. We are only considering mobile usage? b. Check demographics of Android and iOS user i. Implement 2x2 contingency tables based on age, race, location, occupation cross-sectioned with mobile device to determine if there is a relationship between any mobile device and the other races c. Determine if there is a time pattern, has this always been the case i. Calculate the percent of Facebook Users on iOS that use Instagram and the percent of Facebook Users on Android that use Instagram every 3 months since Instagram was acquired by Facebook until now and determine if there is a consistent pattern. If the % was the same for both mobile devices to a point in time I will look for which new features or new developments that might have occurred at the time to see if this was the cause of the difference between mobile devices. I would also look at the d. What type of actions are being done on Facebook and Instagram by the different users (posting, scrolling, sharing, liking, sending messages, shopping, watching videos) i. Are the actions of people using Facebook the same between mobile devices? (% of posting, % of each type of posting, the % time scrolling, etc.) Determine if the actions of people on Instagram the same between the mobile device? Then determine is there is a difference between these actions. We don't know how the samples were collected and they could suffer from a multitude of biases. Resample, look for regression towards the mean. Could be Simpson's paradox at play. Check the samples for statistical significance. Only once significance is established, move on to deeper analysis, as the other excellent answers in this thread. This question is really asking whether OS is a discriminant variable if we want to predict Instagram usage. It's likely that OS is just a proxy of other factors, such as age, country, education, engagement and etc. A quick way to check this is to run a decision tree and check feature importance. If the tree doesn't split on OS, then we should investigate into the factors that actually matter. If OS turns out to be a significant feature, then we need to investigate the reasons. Other answers have provided some possibilities, such as user experience and SEO. One or more comments have been removed. |
Data Scientist at Facebook was asked...
Write a sorting algorithm for a numerical dataset in Python. 9 Answerslist = ["1", "4", "2", "10", "5"] list = [ int(i) for i in list ] list.sort() print (list) list = ["1", "4", "2", "10", "5"] list = [ int(i) for i in list ] list.sort() print (list) Show more responses Here's a bubble sort: number_set = [1, 4, 3, 7, 3, 4, 5, 2, 8, 4] print "This UNSORTED set has %d elements and is: " % len(number_set), number_set for x in range(len(number_set)): for y in range(x, len(number_set)): if number_set[x] >= number_set[y]: continue else: t = number_set[x] number_set[x] = number_set[y] number_set[y] = t print "swapped and the set is now: ", number_set print "This SORTED set has %d elements and is: " % len(number_set), number_set def mergeSort(lst): split = len(lst) / 2 left = lst[:split] right = lst[split:] if len(left) > 1 or len(right) > 1: left = mergeSort(left) right = mergeSort(right) return merge(left, right) def merge(A, B): i=0 j=0 sorted_list = [] while i < len(A) and j < len(B): if A[i] <= B[j]: sorted_list.append(A[i]) i += 1 else: sorted_list.append(B[j]) j += 1 if i < len(A): sorted_list.extend(A[i:]) elif j < len(B): sorted_list.extend(B[j:]) return sorted_list sorted_list = [ ] loops = len(unsorted_list) i = 0 while i <= loops: next_value = min(unsorted_list) sorted_list.append(unsorted_list.pop(unsorted_list.index(next_value))) i += 1 def mysort(arr): if len(arr) pivot] return mysort(left) + middle + mysort(right) def mysort(arr): if len(arr) pivot] return mysort(left) + middle + mysort(right) def mysort(arr): if len(arr) pivot] return mysort(left) + middle + mysort(right) One or more comments have been removed. |
See Interview Questions for Similar Jobs
- Software Engineer
- Data Analyst
- Quantitative Analyst
- Software Developer
- Analyst
- Consultant
- Business Analyst
- Associate
- Research Scientist
- Intern
- Graduate
- Product Manager
- Technology Analyst
- Senior Software Engineer
- Data Engineer
- Senior Data Analyst
- Senior Associate