Fix wrong exam answer in DA0101EN
Question 20 asks what you should do if a model gives R^2 = 1 on the training set and R^2 = 0 on the validation data. There are 3 options. I used my 2 submissions on the other two, so the only remaining option is the obviously wrong one (the model works fine on the training data, so it has no issues).
Please fix this as some people could feasibly need this one question to pass the course.
Hello Julian, I'm not sure what's wrong with the question.
-
Anonymous commented
I think it is wrong too. R^2 = 1 could mean there's a problem but not necessarily. It is really badly worded, as are many questions/answers in this test.
-
Marcos commented
Which of the following statements are true of the Resilient Distributed Dataset (RDD)? Select all that apply.
RDDs allow Spark to reconstruct transformations.
RDDs only add a small amount of code due to tight integration.
RDD is a distributed collection of elements parallelized across the cluster. This is the correct one.
-
Ali_Tailor commented
I selected the 4th and 5th options and it was wrong. I then selected i and the 5th, used Final Check, and it was wrong again. RDDs have only 2 types of operation: transformations (which do not return a value) and actions (which return a value).
So I believe the correct option may be:
2- RDDs allow Spark to reconstruct transformations.
5- RDD is a distributed collection of elements parallelized across the cluster.
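To illustrate the transformation/action distinction, here is a minimal sketch in plain Python. `ToyRDD` is a hypothetical stand-in written for this comment and is not Spark's API; it only assumes the general behaviour that transformations are recorded lazily and replayed when an action is called, which is also what makes recomputation from lineage possible.

```python
class ToyRDD:
    """Hypothetical stand-in for an RDD; NOT part of Spark's API."""

    def __init__(self, source, lineage=()):
        self._source = list(source)
        self._lineage = tuple(lineage)  # recorded transformations, in order

    def map(self, fn):
        # Transformation: records the step and returns a new ToyRDD.
        return ToyRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, pred):
        # Transformation: also lazy, nothing is computed here.
        return ToyRDD(self._source, self._lineage + (("filter", pred),))

    def collect(self):
        # Action: replays the recorded lineage over the source data.
        data = self._source
        for kind, fn in self._lineage:
            if kind == "map":
                data = [fn(x) for x in data]
            else:
                data = [x for x in data if fn(x)]
        return data


calls = []

def square(x):
    calls.append(x)  # track when the function actually runs
    return x * x

rdd = ToyRDD(range(5)).map(square).filter(lambda v: v % 2 == 0)
ran_before_action = len(calls)  # 0: transformations alone computed nothing
result = rdd.collect()          # [0, 4, 16]
print(ran_before_action, result)
```

Calling `collect()` a second time replays the same lineage and gives the same result, which is the idea behind "reconstructing" data from its transformations.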
-
C.A. commented
Hi Joseph,
A number of exam takers, including myself, think the correct answer is answer 3; however, answer 3 on question 20 receives immediate feedback of "incorrect".
Several others who took the exam commented on this here https://courses.cognitiveclass.ai/courses/course-v1:CognitiveClass+DA0101EN+2017/discussion/forum/course/threads/5bfd72a9b88a9d0690000412 .
If #3 is not the correct answer, can someone please explain what the correct answer is and why?
-
Srinivasan V Subramaniam commented
Graded Review answer in BD0211EN
Spark Fundamentals I Cognitive Class BD0211EN
Introduction to Spark - Part 1
Review Question 3 (1 point possible) Which of the following statements are true of the Resilient Distributed Dataset (RDD)? Select all that apply.
I selected the following
RDD is a distributed collection of elements parallelized across the cluster.
Why is it wrong?
-
Julian commented
There is nothing wrong with the question, but I suspect the answer is wrong. I no longer have access to the exam, but the answers were something along these lines:
1. Do nothing, the model works perfectly on the training data
2. Change model, as it is currently underfitted
3. Change model as it is currently overfitted
My first choice was #2, and with my second submission I tried option 3; both times it was counted as incorrect. Option 1 was the only option left, so it would have to be the right answer. However, if a model performs poorly on a test set, we can't possibly accept it as correct.
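For anyone who wants to see why this pattern points to overfitting, here is a minimal sketch. It assumes NumPy, pure-noise targets, and a hand-written 1-nearest-neighbour regressor; all of these are illustrative choices of mine and are not taken from the exam or the course material. The model memorizes the training points, so it scores R^2 = 1 on the training set while doing poorly on fresh validation data.

```python
import numpy as np


def r2_score(y_true, y_pred):
    # R^2 = 1 - SS_res / SS_tot
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def predict_1nn(x_train, y_train, x_query):
    # For each query point, return the target of the closest training point.
    idx = np.abs(x_query[:, None] - x_train[None, :]).argmin(axis=1)
    return y_train[idx]


rng = np.random.default_rng(0)

# Assumption: targets are pure noise, so there is nothing real to learn.
x_train = rng.uniform(0.0, 1.0, 200)
y_train = rng.normal(0.0, 1.0, 200)
x_val = rng.uniform(0.0, 1.0, 200)
y_val = rng.normal(0.0, 1.0, 200)

# On the training set every point is its own nearest neighbour (memorization).
train_r2 = r2_score(y_train, predict_1nn(x_train, y_train, x_train))
# On validation data the memorized noise does not generalize.
val_r2 = r2_score(y_val, predict_1nn(x_train, y_train, x_val))

print(f"train R^2 = {train_r2:.3f}")
print(f"validation R^2 = {val_r2:.3f}")
```

A perfect training score paired with a poor validation score is the usual signature of an overfitted model, which is why option 3 looks like the sensible answer.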