This notebook shows the process of cleaning the Wikipedia ratings data, downloaded from:
https://ckannet-storage.commondatastorage.googleapis.com/2012-10-22T184507/aft4.tsv.gz
The data comes in a database-like format where each rating spans several rows, one for each dimension of the rating. The dimensions are (according to the data dictionary):
1: trustworthy, 2: objective, 3: complete, 4: well-written
(Abbreviated here: trust, obj, comp, writ)
At the end of this file, we want a dataset that has one row for each rating. It should include:
- page_id, so we can connect it to other page info from Wikipedia's database
- rev_id (revision ID), so we can connect it to information about that particular version of the page, and thus compare it to other versions of the page (the changes from one version to another being, of course, edits)
- rating values for each of the four dimensions (trust, obj, comp, writ)
- a datetime field, which will be a Pandas Datetime version of the timestamp variable, indicating when (in time, as opposed to in the page history) a rating was made
- a logged_in field indicating whether the rater was logged in or not (based on the anonymized version of user_id included in the data)
- a page_title field, which is just nice to have

We will also compute some of our own metrics, namely:
- page_rate_count - the number of ratings for each page (also versions for each dimension)
- rev_rate_count - the number of ratings for each version (i.e. revision) of each page (also versions for each dimension)
- rating_dim_count - the number of dimensions rated for each rating (0-4)
- rating_any_mean - the mean of all rated dimensions for each rating, regardless of how many are missing
- rating_all_mean - the mean of all rated dimensions only when all dimensions are nonmissing
- page_*_mean - the mean for each page of all ratings of that page, in each dimension
- rev_*_mean - the mean for each page version of all ratings of that version, in each dimension

import pandas as pd
import numpy as np
We leave it gzipped because it is quite large.
df = pd.read_csv('data/aft4.tsv.gz', compression='gzip', delim_whitespace=True)
df = df[df.page_namespace==0] # only want main wiki pages
df.info()
The TSV file is actually malformed. timestamp should be an int64, but it's an object because a number of rows have string values.
df['int_timestamp'] = pd.to_numeric(df.timestamp, errors='coerce') # non-numeric values become NaN
df[df.int_timestamp.isnull()]
Judging by the fact that user_id == 1 here, I suspect that when the data was created, there was an issue with logged-in users' user names (which have since been scrubbed for anonymity) being stored in a way that spilled across columns, pushing the other columns around. To check this, we can look at the original file:
import gzip
with gzip.open('data/aft4.tsv.gz', 'rt') as f:
    for i in range(50):
        print("%s: %s" % (i, f.readline()), end='')
Results (last lines, line numbers inserted by me):
44: 20110722000020 1148043 New_Zealand_cuisine 0 437274630 0 4 5
45: Brockway 12707285 Wild_Bill_Hickok_–_Davis_Tutt_shootout 0 440740702 1 20110722000021 1
46: Brockway 12707285 Wild_Bill_Hickok_–_Davis_Tutt_shootout 0 440740702 1 20110722000021 2
47: Brockway 12707285 Wild_Bill_Hickok_–_Davis_Tutt_shootout 0 440740702 1 20110722000021 3
48: Brockway 12707285 Wild_Bill_Hickok_–_Davis_Tutt_shootout 0 440740702 1 20110722000021 4
49: 20110722000022 32490439 NULL NULL 440744849 1 1 0
Since the actual rating data is overwritten, there's no way to reconstruct these records, so we have to drop them. First, though, let's make sure they all have user_id == 1:
df[df.int_timestamp.isnull() & (df.user_id != 1)] # should return no rows
Now let's drop these observations.
df = df[df.int_timestamp.notnull()]
df = df.drop('int_timestamp', axis=1) # drop returns a new DataFrame, so reassign to actually drop the helper column
df.info()
Ok, but now there are still some problematic cases. timestamp should always be a 14-digit number.
df[df.timestamp.map(lambda x: len(str(x))<14)]
And, rating_key should never be greater than 4.
df.rating_key = pd.to_numeric(df.rating_key, errors='coerce')
df[df.rating_key > 4]
Let's drop these cases.
df = df[(df.timestamp.map(lambda x: len(str(x)) == 14)) & (df.rating_key <= 4)]
df.info()
Ok, so we dropped around 10,000 cases but we're still left with...EGAD. 46+ million rows! Well, the data is structured so that each rating is 4 separate rows, but still. Let's take a look at the data, first.
df.head(10)
Ok. So here we can see that each page is rated in groups of four observations. Here is the data dictionary:
The following fields are included:
- timestamp: 14-digit timestamp (UTC)
- page_id: article ID
- page_title: full article title
- page_namespace: article namespace (0 is for Wikipedia's main article namespace)
- rev_id: revision ID (article version that was rated)
- user_id: boolean field set to 0 for anonymous raters and to 1 for registered users
- rating_key: numeric key of the rating dimension (1: trustworthy, 2: objective, 3: complete, 4: well-written)
- rating_value: rating value on a 1-5 scale. A value set to 0 indicates that the rater didn't rate the article on the corresponding dimension.
Some things to note:
- user_id doesn't contain an actual identifier; it's just a flag for whether the editor was registered.
- timestamp is the time that the rating was made.
- Each rating is spread across four rows, one per rating_key (i.e. one per rating dimension). (More on this in a bit.)

So, now that we have a better idea of our data, let's start by cutting down the size of the file to a reasonable level. We'll take a sample of the observations. However, we can't just take a random chunk of rows -- then we would overcount pages with lots of ratings, not to mention the fact that we wouldn't get all the data about those pages; in fact, what we want is a random sample of pages.
page_ids = pd.unique(df.page_id) # unique page_ids
np.random.seed(2014) # set the seed
sample_pids = np.random.choice(page_ids, 10000, replace=False) # sample of 10,000 pages
# Now we'll cut the data to only ratings for those pages
s = df[df.page_id.isin(sample_pids)]
s.info()
OK! Now we're down to about 315k. That's manageable. Note that this means that we have about:
s.timestamp.count() / 4. / len(pd.unique(s.page_id)) # N, divided by 4 rows per rating, divided by number of pages in sample =
7.85 ratings per page, which is pretty good, as we'll be able to trace them over time. (We'll deal with the distribution of observations per page later.)
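If we want a quick peek at that distribution now, here is a rough sketch (a side check, not part of the main cleaning flow), dividing by the 4 rows per rating:
# Distribution of ratings per page in the sample (rows per page / 4 rows per rating)
ratings_per_page = s.groupby('page_id').size() / 4.
ratings_per_page.describe()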
Now what we want is a dataset that contains one observation per rating. But how do we identify a particular rating? What's our unique identifier? timestamp and rev_id, of course! ...Right? It would make sense, assuming that two people didn't rate the same article at exactly the same moment. But let's make sure.
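One quick way to check this directly (a sketch, similar in spirit to the obs_count approach built below): if (rev_id, timestamp) identified a rating, every such group would have exactly four rows.
# Number of (rev_id, timestamp) groups that do NOT have exactly the 4 rows of a single rating
(s.groupby(['rev_id', 'timestamp']).size() != 4).sum()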
del(df) # first let's unload df, since it's taking up a lot of RAM
s = s.drop('page_namespace', axis=1) # drop page_namespace, since we don't need it any more
s['obs_count'] = s.groupby(['rev_id', 'timestamp']).rating_value.transform('count')
s.head(20)
So if all ratings have 4 rows, then there should be no observations where s.obs_count != 4:
(s.obs_count != 4).sum()
s[s.obs_count != 4].head(20)
Ah! So here we see that, in fact, some ratings were saved to the database across seconds, meaning that the four rows of a single rating don't always share the same timestamp. Bummer.
Since we can't use timestamp as an ID variable, let's just use rev_id combined with sorting by the index. For each rev_id group, we'll construct a variable equal to 1 for the first foursome (i.e. the first rating), 2 for the second, etc.
# Grouping by rev_id, create the cumulative count (i.e. 0 for first obs with that rev_id, 1 for second, etc.)
# Then, group by fours, and add one (so that the first is 1, the second 2, etc.)
s['nth_rating'] = s.groupby('rev_id').cumcount() // 4 + 1
Let's take a look again at our sample:
s.sort_values(['rev_id', 'nth_rating', 'rating_key'])
Now nth_rating, when combined with rev_id and rating_key, should give us a unique ID for each rating (which we'll need when we want to reshape the data).
s.duplicated(['rev_id', 'nth_rating', 'rating_key']).sum()
Uh oh. Let's view those 4 cases:
s[s.duplicated(['rev_id', 'nth_rating', 'rating_key'])]
s[(s.rev_id == 434558075) & (s.nth_rating.isin([3,4]))]
Here it appears that the same rating was inserted twice into the database at exactly the same time. Note that the timestamps are identical, as, indeed, are all fields. Thus, this appears to be a genuine duplicate, one copy of which we should drop. Because of the grouping of nth_rating, we'll need to change that as well.
s.info()
s.drop_duplicates().info()
s = s.drop_duplicates()
So, we only dropped the 4 observations.
s.loc[(s.rev_id == 434558075) & (s.nth_rating.isin([3, 4])), 'nth_rating'] = 3 # use .loc to avoid chained assignment
s[(s.rev_id == 434558075)]
Finally, to make it easier to deal with the index and reshaping, let's convert the rating keys to meaningful names.
s.rating_key = s.rating_key.replace([1, 2, 3, 4], ['trust','obj','comp','writ'])
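As a quick sanity check (not part of the original flow), each label should now appear exactly as often as the others, with no numeric keys left over:
# Every rating contributes one row per dimension, so the four labels should have identical counts
s.rating_key.value_counts()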
Now we're ready to reshape. We start by dropping variables we no longer need. Next, we set the index to all the variables we don't want to be reshaped (i.e. that don't change for our new observations), with the variable we want to reshape on, rating_key, last.
s = s.drop('obs_count', axis=1) # int_timestamp was already dropped earlier
s = s.set_index(['rev_id', 'nth_rating', 'page_id', 'page_title', 'user_id', 'rating_key'])
Now we do the actual reshaping, using unstack.
stacked = s # keep a reference to the stacked version, for convenience (s is rebound to the unstacked frame below)
#s = stacked # to reset
s = s.unstack(level='rating_key')
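To see what unstack did, we can inspect the columns (a quick check, not part of the original flow): they are now a MultiIndex whose inner level is the rating dimension.
# Each original column (timestamp, rating_value) now has one sub-column per rating dimension
s.columns.tolist()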
First, let's create a datetime variable holding a date object for each rating. Since the unstack gave us four timestamp columns (one per rating dimension), we'll just use the first of them. We create a series, which we will pd.concat with the rest of the data later.
# Create a series of datetimes
datetime = pd.to_datetime(s.timestamp.comp.astype(str), format='%Y%m%d%H%M%S', errors='coerce') # "comp" (complete) is the first of the four rating_keys
datetime.name = 'datetime'
Now that we have the datetime (i.e. the information we care about from timestamp), the only other data we want to keep is either already in the index, or else is in rating_value. Since we don't want to keep a MultiIndex if we don't need it (it makes accessing columns a pain), we can just set our data to be s.rating_value (which has all the other things we care about in the index), and add all the other measures to that.
rating_values = s.rating_value
Before we can compute means of ratings, etc., we need to clean up the rating_value fields. In particular, we need to:
- convert them to numeric (they currently have dtype object)
- replace 0s with NaN (i.e. missing or NA), since, as the data dictionary states: "A value set to 0 indicates that the rater didn't rate the article on the corresponding dimension."

rating_values = rating_values.apply(pd.to_numeric, errors='coerce') # convert rating value columns to numeric
rating_values = rating_values.replace(0, np.nan)
rating_values.head(20)
If all of a rating's dimension values are zero/missing, then we obviously can't get much information from the rating, so drop these cases.
print("BEFORE: %d cases with all missing rating values" % rating_values.isnull().all(axis=1).sum())
rating_values.dropna(how='all', inplace=True) # drop in place if _all_ values are NA
print("AFTER: %d cases with all missing rating values" % rating_values.isnull().all(axis=1).sum())
Now we set s to be rating_values, which contains in its index the other info we want, concatenated with the datetime series we created earlier.
s = pd.concat([pd.DataFrame(datetime), rating_values], axis=1, join='inner') # inner join so the all-missing ratings we just dropped don't come back as empty rows
s.info()
s.head()
Create a variable counting the number of dimensions actually rated for each rating.
s['rating_dim_count'] = rating_values.count(axis=1)
s.head()
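For a quick look at how many of the four dimensions raters typically fill in (a side check, not part of the original flow):
# Distribution of the number of dimensions actually rated
s.rating_dim_count.value_counts().sort_index()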
Create two variables with the mean of all rating dimensions:
- rating_any_mean -- mean of all dimensions, regardless of how many missing dimensions
- rating_all_mean -- mean of all dimensions only when all dimensions are nonmissing

s['rating_any_mean'] = rating_values.mean(axis=1)
s['rating_all_mean'] = rating_values[rating_values.notnull().all(axis=1)].mean(axis=1)
s.head(20)
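To compare the two means (another side check): rating_all_mean should be missing whenever rating_dim_count is less than 4, and equal to rating_any_mean otherwise.
# The 'count' row of describe() will be lower for rating_all_mean, reflecting the partial ratings
s[['rating_dim_count', 'rating_any_mean', 'rating_all_mean']].describe()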
We're about ready to reset_index so we can use groupby to create revision- and page-wide statistics, but before we do, let's create an array of all the rating value dimensions.
rating_dims = rating_values.columns
Now we can reset_index and delete rating_values.
del(rating_values)
s = s.reset_index()
Rename user_id to logged_in, which is clearer about what the variable is actually measuring.
s.rename(columns={'user_id': 'logged_in'}, inplace=True)
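As a quick illustration of the renamed field (a side check), the share of ratings that came from logged-in vs. anonymous raters:
# Proportion of ratings from logged-in (1) vs. anonymous (0) raters
s.logged_in.value_counts(normalize=True)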
Create count variables for the number of ratings for each page and for each version.
page_groups = s.groupby(['page_id'])
s['page_rate_count'] = page_groups.page_id.transform('count')
rev_groups = s.groupby(['rev_id'])
s['rev_rate_count'] = rev_groups.rev_id.transform('count')
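A quick peek (a side check, not needed for the cleaning itself) at which pages in the sample attracted the most ratings:
# One row per page, sorted by how many ratings that page received
(s.drop_duplicates('page_id')[['page_title', 'page_rate_count']]
   .sort_values('page_rate_count', ascending=False)
   .head())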
And create a mean within each page/revision for each of the four dimensions.
for var in rating_dims:
s['page_%s_mean' % var] = page_groups[var].transform('mean')
s['rev_%s_mean' % var] = rev_groups[var].transform('mean')
And now we're almost done! We just need to sort it, and save.
s = s.sort_values(['page_id', 'rev_id', 'datetime'])
s.to_csv("cleaned_wiki_ratings.csv")
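As a final sanity check (a sketch; the filename is the one used above), the saved file should read back with one row per rating and a parseable datetime column:
# Read the cleaned file back in, parsing the datetime column, to confirm the round trip
check = pd.read_csv("cleaned_wiki_ratings.csv", parse_dates=['datetime'])
check.info()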