Hi! I know you are probably asking, what the heck is he doing? I will be honest here: this is not a tutorial but a quick fix to a problem I faced last night. I was invited by an author to take part in so-called blog hopping. One of the requirements was to find at least five other authors with blogs to join the adventure. So I asked myself: how could I make this as much fun as possible and learn something at the same time?
So today, I packed my gear and headed to a nearby coffee shop to create something. A few lines of code later (plus a one-hour wait after I exceeded Twitter's rate limit of 150 requests), I had put together a simple script that reduced the pain of searching for potential bloggers (who are authors). I made two assumptions:
- I assumed that Twitter users who mentioned the word ‘authors’ were likely to be authors themselves or to review books on their blogs.
- Secondly, I made a partial assumption that most Twitter users who talk about authors had filled in the ‘website’ field of their profiles, so I could easily get their blog addresses. If they didn’t have anything listed, I just moved on.
Now let us take a closer look at the code: part 1
# Let us get our first part out of the way here.
# I am using Python, so let us import the needed libraries.
import json
import urllib2

url = 'http://search.twitter.com/search.json?q=authors'  # search link
user_ids = []  # list to store user_ids

# Define the first of the two functions to be used.
def get_user_ids():
    data = urllib2.urlopen(url)  # make the call to Twitter
    js = json.load(data)  # parse using the json.load() method
    for result in js['results']:  # you need to know what is returned
        user_ids.append(result['from_user_id_str'].encode('utf-8'))
    return user_ids

# Explanation:
# After getting data from Twitter, I iterate through the 'results'
# and grab the from_user_id_str of whoever mentioned the word 'authors'.
# I find this more convenient than using a screen name.
# I then add that value to the user_ids list for later use.
# Finally, I return the list.
print 'Move to the next level now!'
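To see what that loop is reading without calling Twitter at all, here is a small offline sketch. The sample payload is made up for illustration and keeps only the two keys the script touches (`results` and `from_user_id_str`); the helper name `extract_user_ids` is mine and not part of the script.

```python
import json

# Hypothetical, trimmed-down search response: only the fields the script reads.
SAMPLE_SEARCH_JSON = '''
{"results": [
    {"from_user_id_str": "111", "text": "Two authors I love"},
    {"from_user_id_str": "222", "text": "Calling all authors!"}
]}
'''

def extract_user_ids(payload):
    """Return the from_user_id_str of every entry under 'results'."""
    return [result['from_user_id_str'] for result in payload['results']]

sample_ids = extract_user_ids(json.loads(SAMPLE_SEARCH_JSON))
print(sample_ids)
```

Keeping the parsing in its own function like this makes it easy to check against a saved response before spending any of the rate limit.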
In the second part of this post, I want us to look at the second function definition that will complete our script. Yeah, it is really short!
# Within the same file, we will continue our script!
def get_blogs():
    user_urls = []  # define a list to store the urls
    user_ids = get_user_ids()  # get the user_ids and store for use
    for user_id in user_ids:
        url = 'http://api.twitter.com/1/statuses/user_timeline/' + user_id + '.json'
        data = urllib2.urlopen(url)  # make the call
        js = json.load(data)
        # The timeline comes back as a list of tweets, and each tweet carries
        # the user's profile, so we read it from the first tweet (if any).
        if js and js[0]['user']['url'] is not None:
            user_urls.append(js[0]['user']['url'])
    return user_urls
Can you believe we have finished the script? Well, we have reached the end. I know you are saying: show me! Fair enough, and a snapshot is the best answer. Let us run the code by doing the following:
# Call the function and store the links in a list.
user_links = get_blogs()
for link in user_links:
    print link
# That is all we need to get everything from the list!!
And... here is what I got when I executed that code!
As you can see, you get easy-to-read URLs that you can open using your browser of choice! One thing you might notice is that not all of the links point to WordPress or Blogspot. You can go ahead and improve the script to grab only those links that have either wordpress or blogspot in them. You can view the image clearly by clicking on it!
So why is this a better idea than searching on Google for individual bloggers? Simple: you get a ton of links to scan through in your browser, so you can send requests to their owners and save yourself some time!
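If you want to try that improvement, here is one possible sketch. The host list and the function names (`is_blog_link`, `keep_blog_links`) are my own assumptions, and the sample links are invented; it checks the host part of each link rather than searching the whole string, so a link like `http://example.com/why-i-left-wordpress.com` is not mistaken for a blog.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2, which the script above uses

BLOG_HOSTS = ('wordpress.com', 'blogspot.com')  # assumed hosting domains

def is_blog_link(link):
    """True when the link's host is, or is a subdomain of, a known blog host."""
    host = (urlparse(link).hostname or '').lower()
    return any(host == h or host.endswith('.' + h) for h in BLOG_HOSTS)

def keep_blog_links(links):
    """Keep only the links that point at one of the blog hosts."""
    return [link for link in links if is_blog_link(link)]

# Invented sample data standing in for the list returned by get_blogs().
sample_links = [
    'http://janedoe.wordpress.com',
    'http://books.blogspot.com/2012/',
    'http://example.com/about',
]
blog_links = keep_blog_links(sample_links)
print(blog_links)
```

Two caveats: links written without `http://` have no host for `urlparse` to find and are dropped, and blogs running on their own custom domain will be missed.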
As far as time is concerned, this script makes several API calls per run (one search, then one timeline request for every user found), so you will notice a slight delay before it finishes. For the same reason, you might hit the rate limit (150 requests) without realizing it. That being said, I still had fun doing this and I have sent numerous requests for blog hops within a short time!
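One cheap way to stretch that limit: the same person can show up more than once in the search results, and every repeat costs another timeline call. A minimal sketch, assuming a per-run budget of my own choosing (`max_calls` is a hypothetical parameter, not a Twitter setting), drops the repeats and caps the number of lookups:

```python
def plan_lookups(user_ids, max_calls=50):
    """Drop repeated IDs (keeping first-seen order) and cap the total."""
    seen = set()
    unique_ids = []
    for user_id in user_ids:
        if user_id not in seen:
            seen.add(user_id)
            unique_ids.append(user_id)
    return unique_ids[:max_calls]

# Invented IDs: two of them appear twice, and the budget allows two lookups.
planned = plan_lookups(['111', '222', '111', '333', '222'], max_calls=2)
print(planned)
```

In the script, this would wrap the list returned by `get_user_ids()` before the loop in `get_blogs()` starts making requests.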
To recap: I make an API call to Twitter's servers to search for the word ‘authors’, then save the user_ids of the people who mentioned that word. I then use those user_ids to make another API call for each person's timeline. While doing that, I grab their url (the property that is part of the Twitter profile). If the url is missing, I ignore it; otherwise, I store it in a new list. Finally, I iterate through the list, print the links out in a clean manner, and visit them individually! That is it! I hope you had fun reading through this post. Got questions? Ask them! Thank you.