Coursera R Programming Assignment 1 Solution Group

In this introduction to R, you will master the basics of this beautiful open source language, including factors, lists and data frames. With the knowledge gained in this course, you will be ready to undertake your very own first data analysis. With over 2 million users worldwide, R is rapidly becoming the leading programming language in statistics and data science. Every year, the number of R users grows by 40% and an increasing number of organizations are using it in their day-to-day activities. Leverage the power of R by completing this free R online course today!

  • In this chapter, you will take your first steps with R. You will learn how to use the console as a calculator and how to assign variables. You will also get to know the basic data types in R. Let's get started!

  • In this free R course, we'll take you on a trip to Vegas, where you will learn how to analyze your gambling results using vectors in R! After completing this chapter, you will be able to create vectors in R, name them, select elements from them and compare different vectors.

  • In this chapter you will learn how to work with matrices in R. By the end of the chapter, you will be able to create matrices and to understand how you can do basic computations with them. You will analyze the box office numbers of Star Wars to illustrate the use of matrices in R. May the force be with you!

  • Very often, data falls into a limited number of categories. For example, human hair color can be categorized as black/brown/blonde/red/grey/white (and perhaps a few more options for people who dye it). In R, categorical data is stored in factors. Given the importance of these factors in data analysis, you should start learning how to create, subset and compare them now!

  • Most data sets you will be working with will be stored as data frames. By the end of this chapter focused on R basics, you will be able to create a data frame, select interesting parts of a data frame and order a data frame according to certain variables.

  • Lists, as opposed to vectors, can hold components of different types, just like your to-do list at home or at work. This intro to R chapter will teach you how to create, name and subset these lists.

  • This lecture is about getting help.

    This lecture applies both to this course

    that you're taking right now, the Data Scientist's Toolbox,

    and also to all the other courses you're going to be taking in the course track.

    So, keep in mind that in a standard class you may have taken in a

    class of 30 or 100 people, you would raise your hand and ask a question.

    And then you'd be able to immediately get feedback from your instructor.

    But in a class like this, a massive open online

    course, there could be up to 100,000 people taking the class.

    And what you're going to do instead is post your questions to the message board.

    And then hopefully, your fellow students

    will upvote them if they're good questions.

    And your instructors will try to respond to as many as possible, but

    probably more often than that your peers or community TAs will be responding.

    And so, there are three of us that are teaching these nine classes and we

    are going to try to put in as much as we can to answer your questions.

    But obviously, that's a limited resource.

    And so relying on your peers and your community

    TAs is, we've found, a great way to get involved.

    We've also learned that the community that's built around the

    message boards and the massive online open courses is amazing.

    And it's probably the best learning part of the entire experience.

    And so hopefully, you'll get involved and you'll

    be an active participant in those message boards.

    It's very clear that the fastest answer is often the one that you find for yourself.

    So to try to answer your questions yourself, you should try to

    look it up on Google or look it up on Stack Overflow.

    If you ask a question that's very simple to Google, you'll

    often get a response that says "Google it" or "read the documentation",

    which is not the easiest way to get the answer that you're going for.

    An important part of being an active participant

    in a community environment like this is that, if

    you figure out an answer to a question, you post it to the message board.

    If you're struggling with a particular part

    or structure or idea or R programming exercise,

    it's almost a sure bet that there are a lot

    of other people that are struggling with the same thing.

    And so, they'll really appreciate it if you take the time to post

    to the message board how you figured out the solution to that problem.

    So, I thought I'd mention just a few important R functions that will

    help you to find answers for some of the questions you might have.

    So, when you have an R function (we'll talk a

    little bit more about R later in the class),

    there are several different things you can

    type to get the help file for that function.

    So, one example is that you can type it like this.

    You can type ?rnorm and that will show you

    the help file for the function rnorm.

    You can also search like this, with help.search.

    And if you use help.search, you might not even

    necessarily have to get the function name exactly right.

    It'll still search through the help files and try to find things for you.

    And then, if you want to get the arguments for a function,

    you can use the command args, like this:

    args(rnorm), and that'll tell you the function arguments.
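
    The three help commands just described can be tried at the R console roughly as follows. This is only a small sketch: the search term passed to help.search and the variable name rnorm_args are chosen for illustration.

    ```r
    # Open the help file for a function whose name you know.
    ?rnorm

    # Search the installed help files; the name does not have to be exact.
    help.search("rnorm")

    # Show the arguments of a function. The result is kept in a
    # variable (rnorm_args, a name chosen for this example) and printed.
    rnorm_args <- args(rnorm)
    print(rnorm_args)
    ```

    The first two commands open or print help pages, so what you see depends on which packages are installed on your machine.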

    These functions are very useful if your goal is to try

    to figure out how R is working for a particular function.

    But they might not be so useful if you want

    to understand the underlying concepts involved in those functions.

    So another thing you might want to do is

    actually look a little bit deeper into the code.

    So if you wanted to do that you can actually just type the function

    name without any brackets and it will

    actually reproduce the entire code for you.

    And so what you see here is that if I type rnorm like this,

    then what I end up getting out on the R console is actually this right here.

    I get out all the code that corresponds to that function.
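
    A minimal sketch of looking at a function's code in this way is shown here. The variable name rnorm_body is made up for the example, and body() is simply one additional way to get at the same code.

    ```r
    # Typing the name alone, without parentheses, prints the function's code.
    rnorm

    # The body can also be extracted directly; rnorm_body is a name
    # chosen for this example.
    rnorm_body <- body(rnorm)
    print(rnorm_body)
    ```

    For some functions, rnorm among them, the R code is only a thin wrapper that hands the work to compiled code, so the printout can be quite short.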

    You could also see this link here to a

    reference card with a lot of helpful R functions.

    So, an important point that you'll run into a lot

    in this class is how to ask an R question.

    And so there are a few different components

    of it that you should keep in mind.

    First, you will want to outline the steps

    that you executed that led to the problem.

    So, if you ran three functions in order

    you should reproduce what those three functions are.

    And then you should say what you expect the output to be.

    And then what you saw instead.

    So I expected it to give me the answer

    to this question and instead, it gave me an error.
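
    As a rough illustration of reporting the steps, the expected output and the observed output, a question might include something like the following. The data and the failing call are invented for this sketch; they are not from the lecture.

    ```r
    # Steps executed, in order (a made-up example for illustration).
    x <- c("1", "2", "3")        # values that were read in as text

    # The call that fails; tryCatch keeps the error message so that
    # it can be copied into the question.
    result <- tryCatch(
      sum(x),
      error = function(e) conditionMessage(e)
    )

    # Expected: the numeric total of the three values.
    # Observed: an error, whose message is now stored in result.
    print(result)
    ```

    Pasting the exact error message, rather than describing it, lets other people search for it directly.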

    And so a really important thing to keep in mind is that R packages and R and all

    of these other tools that we're going to be

    telling you about are going to be evolving over time.

    And so it's really important that you state

    the version of the software that you're using.

    So, the version of the package, the version of R

    that you're using and then what operating system you're working on.

    Whether you're on Mac or Linux or Windows.
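
    One possible way to collect these details from inside R is sketched below. The variable names are chosen for this example, and the stats package merely stands in for whichever package your question is about.

    ```r
    r_version   <- R.version.string          # version of R itself
    pkg_version <- packageVersion("stats")   # version of one package
    os_name     <- Sys.info()[["sysname"]]   # operating system name

    print(r_version)
    print(pkg_version)
    print(os_name)

    # sessionInfo() gathers R, platform and loaded-package versions
    # in a single report that can be pasted into a question.
    print(sessionInfo())
    ```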

    When you're asking a data analysis question, there's a similar

    set of things that you need to report.

    So first is what is the question you are trying to answer.

    You're saying I'm trying to relate variable y to variable x.

    And then, what steps or tools do you use to answer it?

    This may be a combination of R tools and outside tools and maybe some intuition.

    And then, again, you report what you expected to see.

    I expected to be able to tell the

    relationship between them and what do I see instead?

    I see oh, I don't know, I see some

    crazy scatter plot and I don't know what that means.

    And so an important thing to keep in mind here

    too is what other solutions you might have thought about.

    So sometimes you run through three or four

    different things to try to get the right answer.

    And so, if you report the different things you tried, then

    when people try to answer your question, they

    can go directly to something you haven't tried.

    So an important part of asking questions in

    a massive class like this is to make sure

    that you're very specific in the titles of

    the questions that you're using on the message forum.

    So some examples of bad titles are things like this.

    So you can say, "Help!

    I can't fit a linear model."

    Then you're not exactly giving a lot of detail as to

    what exactly your problem is or how it can be addressed.

    So, a better title is one that says, okay, I

    have this function and it's happening in this version of R, 2.15.

    And here is the error that's being produced.

    It's a seg fault that's being produced.

    And it's only being produced when I have a

    large data set and here's the software that I'm using.

    I'm using Mac OS X 10.6.3.

    An even better question uses a title that's a little bit more succinct.

    So, here you lead off again with the function that you're asking about, you

    say okay, I'm asking about R 2.15 and again it's on this operating system.

    And then you very succinctly describe the problem: seg fault on large data frame.

    So by focusing on the very specific details, it means people

    can jump very quickly to the answers that you might need.

    So there are similar sorts of specific details you would

    want to give when asking questions about data analysis problems.

    So, in general, the more specific you are the faster your answer will come.

    So there's some etiquette that we would like

    to encourage in terms of using these forums,

    or in just using help sites in general, not necessarily only these forums.

    So, again, describe the goal that you have.

    What's the question you're trying to answer?

    Be very explicit.

    Try to provide the minimum amount of information necessary.

    If you provide way too much information, it's very hard for

    people to filter through and figure out what the real problem is.

    Being polite never hurt anybody and will often get you your answer more quickly.

    And then follow up and post solutions.

    So if you post a question and somebody ends up giving

    you the answer on Stack Overflow instead of on the course website,

    the polite thing to do is to post what you found on the course

    website so that people can search it and find that answer as well.

    Please, please use the forums rather than using personal emails.

    We are very excited about trying to help you learn about data science.

    But it's very easy to overwhelm the inboxes of your

    instructors or community TAs if you all start sending emails simultaneously.

    When there's a typo in the assignment, please report it on the

    forums and we will address it as fast as we possibly can.

    Some things that you shouldn't necessarily do are immediately

    assume you found a bug in a major program.

    So, saying you found a bug in R and that's why things aren't working.

    Groveling as a substitute for your work is obviously not a great thing.

    So begging other people to do your work for you.

    Please don't post homework questions on mailing lists or on the course forums.

    If you post the questions or the answers on the forums,

    it sort of takes away from the experience of everybody else.

    And then, you don't want to ask general data analysis questions on R forums.

    Those are often redirected back to courses.

    So try to keep those on our course forums, where hopefully there'll

    be a big group of interested people all trying to answer the same questions.

    So the credit for these slides goes to

    Roger Peng, who's another instructor in the course track.

    He has these getting help videos.

    That's a link to his video on YouTube and it was

    inspired by Eric Raymond's "How To Ask Questions The Smart Way".
