GitHub is the de facto home of open source software on the web. While other platforms provide
similar (and in some cases better) services, it still hosts some of the largest open source software
projects in the world. You will be creating an account on GitHub for several reasons:

1. we will be using GitHub to store, transmit and share homework and lecture notes,
2. we will use GitHub for all assignments,
3. at some point in your future, you will very likely be using GitHub, whether for private work for a company or
public work on an open source project.

§ Find ONE GitHub repository of interest and explore it. There is nothing to turn in for this task;
just begin to explore GitHub.

(25%) Exploring unstructured text data with Python

In this part, you will explore some text data and get a little familiar with Python’s parsing and text capabilities.
You will grab data from the free books provided online by Project Gutenberg and use the provided code
to compare these documents. Turn in the answers to the given tasks after studying the code provided in
gutenberg_get_words(), which takes a URL for a book on Project Gutenberg and returns a list of the words
in the book. Notice that the stopwords= parameter is used to eliminate words that convey little or no information.
This is a common technique used in text processing.
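As a rough illustration of stopword filtering, the sketch below cleans a single sentence instead of a whole book. The names filter_stopwords and SAMPLE_STOPWORDS and the sample sentence are made up for this example. It assumes the pattern r"[^\w\s]" from the provided listing is applied with re.sub, which the listing itself does not show.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only;
# the assignment's US_STOPWORDS list is much longer.
SAMPLE_STOPWORDS = {"a", "and", "in", "of", "the", "to"}


def filter_stopwords(text, stopwords=()):
    """Lowercase text, strip punctuation, split on whitespace, drop stopwords."""
    # Remove every character that is neither a word character nor whitespace.
    # Using re.sub here is an assumption about how the pattern is applied.
    cleaned = re.sub(r"[^\w\s]", "", text).lower()
    blocked = set(stopwords)  # set membership tests are fast
    return [w for w in cleaned.split() if w not in blocked]


sample = "I walk in the streets of Petersburgh, and I feel a cold northern breeze."
print(filter_stopwords(sample, SAMPLE_STOPWORDS))
```

Words such as "in", "the" and "of" disappear from the result, while the content words stay in their original order.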

These five books will be used for the tasks for this question:

Book | URL
The Prince, Machiavelli | [url removed, login to view]
Frankenstein; Or, The Modern Prometheus by Mary Wollstonecraft Shelley | [url removed, login to view]
Siddhartha by Hermann Hesse | [url removed, login to view]
The Republic by Plato | [url removed, login to view]
The Federalist Papers by Alexander Hamilton, John Jay, and James Madison | [url removed, login to view]

import requests
import re

US_STOPWORDS = ["a", "about", "above", "above", "across", "after", "afterwards", "again", "against", "all",

def gutenberg_get_words(url="[url removed, login to view]",
                        range=slice(0,None), stopwords=[]):
    r = [url removed, login to view](url)
    data = [url removed, login to view](r"[^\w\s]", "", str([url removed, login to view])).lower()
    return \
        [w for w in [url removed, login to view]() if w not in stopwords]

words = gutenberg_get_words(
    "[url removed, login to view]",

['london', 'i', 'walk', 'streets', 'petersburgh', 'i', 'feel', 'cold', 'northern', 'breeze', 'play', 'cheeks

§ submit the Python code that does the following:

• using the code and the 5 books provided above, explore and apply the very nice Python standard library module called
collections. Use the Counter class to load the word frequencies of each book into a Python dictionary.

• NOTE: you will need an internet connection for this to work, since it loads the data
directly from the URLs of the books.
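A minimal sketch of the Counter step, under stated assumptions: the short word lists and the titles book_a and book_b are hypothetical stand-ins for what gutenberg_get_words() would return for each book, so the example runs without an internet connection.

```python
from collections import Counter

# Hypothetical stand-ins for the word lists that gutenberg_get_words() would
# return; real lists would contain every non-stopword in each book.
book_words = {
    "book_a": ["prince", "state", "prince", "people", "state", "prince"],
    "book_b": ["life", "man", "life", "father", "life"],
}

# Counter is a dict subclass that maps each word to how often it occurs.
frequencies = {title: Counter(words) for title, words in book_words.items()}

for title, counts in frequencies.items():
    # most_common(n) returns the n highest counts as (word, count) pairs.
    print(title, counts.most_common(2))
```

Because Counter is a dict subclass, each entry of frequencies can be used anywhere a plain dictionary of word counts is expected.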

§ turn in at least 2 sentences, plus any code you used, answering the following:

• there are similarities and differences in the top 30 words of the five provided documents. Be specific
in describing what they are. How similar or different are the top 30 word lists? You can
compare them by hand (look at them), or you are encouraged to write Python code to compare them.
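One possible way to compare the lists in code is sketched below. The assignment does not prescribe a similarity measure; Jaccard similarity (shared words divided by all distinct words) is an assumption chosen for this sketch. The helper names top_words and jaccard and the toy word lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_words(words, n=30):
    """Return the set of the n most frequent words in a list of words."""
    return {word for word, _ in Counter(words).most_common(n)}


def jaccard(a, b):
    """Share of words two sets have in common: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy word lists; n=3 keeps the example readable.
books = {
    "book_a": ["man", "state", "man", "power", "state", "war"],
    "book_b": ["man", "life", "man", "state", "life", "love"],
}
tops = {title: top_words(words, n=3) for title, words in books.items()}

# Compare every pair of books once.
for (t1, s1), (t2, s2) in combinations(tops.items(), 2):
    print(t1, t2, "shared:", sorted(s1 & s2), "similarity:", round(jaccard(s1, s2), 2))
```

With the five real books, the same loop would print ten pairwise comparisons, and the shared-word lists give concrete material for the written answer.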

Skills: Data Mining, Python


About the employer:
( 1 review ) Dallas, United States

Project ID: #15123675

6 freelancers are bidding on average $37 for this job


Hi, sir! I have taken a close look at your project. I have good skills in Python programming. If you award this project to me, we'll complete it on time. Our budget may be negotiable. Thanks

$55 USD in 1 day
(24 reviews)

A proposal has not yet been provided

$50 USD in 2 days
(30 reviews)
$25 USD in 1 day
(5 reviews)
$25 USD in 1 day
(4 reviews)

Feel free to contact me [url removed, login to view] me message to discuss further details. We provide comments, images, videos, demos and live sessions in order to help the [url removed, login to view] payment only after the work [url removed, login to view] yo

$45 USD in 1 day
(3 reviews)

A proposal has not yet been provided

$20 USD in 1 day
(0 reviews)