Ghost Tweeter

This post on the Ghost forums asks for a way to automatically and periodically tweet links to blog posts that are X days old. So, I made one.

GhostTweeter is written in Python and uses tweepy and feedparser to cycle through a Ghost blog's RSS feed and pull post information such as the title, URL, tags, and publish date. If a minimum age is set, GhostTweeter will only tweet blog posts older than that minimum age. GhostTweeter also turns your blog post tags into hashtags.

Tweets are sent in the following format:

<post title>: <url> #<tag1> #<tag2>  
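
The full script isn't shown in this post, but to give a rough idea of how a tweet like that gets built, here is a minimal sketch using feedparser. The feed URL and minimum-age value are placeholders, and the entry fields (title, link, tags, published_parsed) are feedparser's standard attributes rather than code copied from GhostTweeter.

import time
import feedparser

feed = feedparser.parse("http://yourblog.com/rss/")
minimum_age_days = 7  # placeholder minimum age

for entry in feed.entries:
    # published_parsed is a time.struct_time provided by feedparser
    age_days = (time.time() - time.mktime(entry.published_parsed)) / 86400
    if age_days < minimum_age_days:
        continue

    # turn the post's tags into hashtags
    hashtags = ' '.join('#' + tag.term.replace(' ', '') for tag in entry.get('tags', []))

    print '%s: %s %s' % (entry.title, entry.link, hashtags)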


Usage

Before you can run the Python file you must get your Twitter authentication tokens and place them in your code.

  1. Go to https://apps.twitter.com/app/new and create a new app
  2. Once the app is created, copy and paste your API key and secret (they are under "Keys and Access Tokens")
  3. Create an access token and copy and paste the token and its secret as well

This is the Twitter authentication part of the Python code; anywhere you see 'yourshere' you must place the correct key or token.

# Twitter Auth
CONSUMER_KEY = 'yourshere'  
CONSUMER_SECRET = 'yourshere'  
ACCESS_KEY = 'yourshere'  
ACCESS_SECRET = 'yourshere'  
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)  
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)  
api = tweepy.API(auth)  
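
For reference, once the handler above is authenticated, sending the tweet itself is a single tweepy call. The title, url, and hashtags variables below are just placeholders for the pieces of the format shown earlier:

# build the tweet text and send it
tweet = '%s: %s %s' % (title, url, hashtags)
api.update_status(status=tweet)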


Running the code

Once you have correctly entered your Twitter authentication info you can run the script. This code is written in Python 2.7 and will not work in Python 3 (yet). Open up your favorite terminal and type: python GhostTweeter.py

It will now ask you for your Ghost URL, as well as how often you want to tweet (in hours). Once that is all set you are ready to go!
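
Roughly speaking, that boils down to a prompt-and-sleep loop along these lines (a sketch, not the actual source; the prompts and the tweet_next_post helper are illustrative):

import time

ghost_url = raw_input("Ghost blog URL: ")
hours = float(raw_input("How often to tweet (in hours): "))

while True:
    tweet_next_post(ghost_url)   # illustrative helper that builds and sends one tweet
    time.sleep(hours * 3600)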

Note: Depending on your OS, if you want to run this in the background you can simply run python GhostTweeter.py & and the code will do the rest of the work.

GhostTweeter on GitHub

This code is easy to change and, with some simple modifications, could probably be used with any RSS feed.


Python IRC Bot

I wrote (well, more like borrowed and molded) an IRC bot in high school, and it is a giant piece of crap in terms of code. So recently I condensed it down and decided to redo it all. Here's all of the code; I'll break it down after.

To run it, simply:

python irc-bot.py <channel> <log.txt>  

Code

# Copyright (c) Twisted Matrix Laboratories
# See LICENSE for details.

#Default Imports
import requests


# twisted imports
from twisted.words.protocols import irc  
from twisted.internet import reactor, protocol  
from twisted.python import log

# system imports
import time, sys

#bot settings
nick = "Bot" #Name in chat  
server = "irc.freenode.net" #chat server

class MessageLogger:  
    """
    An independent logger class (because separation of application
    and protocol logic is a good thing).
    """
    def __init__(self, file):
        self.file = file

    def log(self, message):
        """Write a message to the file."""
        timestamp = time.strftime("[%H:%M:%S]", time.localtime(time.time()))
        self.file.write('%s %s\n' % (timestamp, message))
        self.file.flush()

    def close(self):
        self.file.close()


class LogBot(irc.IRCClient):  
    """A logging IRC bot."""

    nickname = nick

    def connectionMade(self):
        irc.IRCClient.connectionMade(self)
        self.logger = MessageLogger(open(self.factory.filename, "a"))
        self.logger.log("[connected at %s]" % 
                        time.asctime(time.localtime(time.time())))


    def connectionLost(self, reason):
        irc.IRCClient.connectionLost(self, reason)
        self.logger.log("[disconnected at %s]" % 
                        time.asctime(time.localtime(time.time())))
        self.logger.close()

    # callbacks for events

    def signedOn(self):
        """Called when bot has succesfully signed on to server."""
        self.join(self.factory.channel)

    def joined(self, channel):
        """This will get called when the bot joins the channel."""
        self.logger.log("[I have joined %s]" % channel)

    def privmsg(self, user, channel, msg):
        """This will get called when the bot receives a message."""
        user = user.split('!', 1)[0]
        self.logger.log("<%s> %s" % (user, msg))

        #Here we only let user "Hunter" run this command
        #which tells the bot to join a channel 
        #Example: ".join #python"
        if user == "Hunter":
            if msg.startswith(".join"):
                joinchannel = msg[6:]
                self.join(joinchannel)

        #Tells the bot to leave the channel, takes no arguments
        if user == "Hunter":
            if msg.startswith(".leave"):
                self.leave(channel)

        #Commands are structured like .command [args]
        if msg.startswith(".about"):
            msg = "A bot created by Hunter"
            self.msg(channel, msg)
            self.logger.log("<%s> %s" % (self.nickname, msg))

    def action(self, user, channel, msg):
        """This will get called when the bot sees someone do an action."""
        user = user.split('!', 1)[0]
        self.logger.log("* %s %s" % (user, msg))

    # irc callbacks

    def irc_NICK(self, prefix, params):
        """Called when an IRC user changes their nickname."""
        old_nick = prefix.split('!')[0]
        new_nick = params[0]
        self.logger.log("%s is now known as %s" % (old_nick, new_nick))


    # For fun, override the method that determines how a nickname is changed on
    # collisions. The default method appends an underscore.
    def alterCollidedNick(self, nickname):
        """
        Generate an altered version of a nickname that caused a collision in an
        effort to create an unused related name for subsequent registration.
        """
        return nickname + '^'



class LogBotFactory(protocol.ClientFactory):  
    """A factory for LogBots.

    A new protocol instance will be created each time we connect to the server.
    """

    def __init__(self, channel, filename):
        self.channel = channel
        self.filename = filename

    def buildProtocol(self, addr):
        p = LogBot()
        p.factory = self
        return p

    def clientConnectionLost(self, connector, reason):
        """If we get disconnected, reconnect to server."""
        connector.connect()

    def clientConnectionFailed(self, connector, reason):
        print "connection failed:", reason
        reactor.stop()

if __name__ == '__main__':  
    # initialize logging
    log.startLogging(sys.stdout)

    # create factory protocol and application
    f = LogBotFactory(sys.argv[1], sys.argv[2])

    # connect factory to this host and port
    reactor.connectTCP(server, 6667, f)

    # run bot
    reactor.run()

A Web Scraper That Sucks Even Less!

I recently wrote a post about using BeautifulSoup and urllib2 to scrape HTML off webpages and parse it into useful text. The only issue was that it was easy to get banned while using it.

This modification to the code does not make you ban-proof, and the same warning applies.

from bs4 import BeautifulSoup  
import urllib2  
import random  
import time

#urls to be scraped
urls = ["http://www.hunterthornsberry.com", "http://huntert.me/"]

#user agents
user_agents = [  
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11',
    'Opera/9.25 (Windows NT 5.1; U; en)',
    'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
    'Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Kubuntu)',
    'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.142 Safari/535.19',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:11.0) Gecko/20100101 Firefox/11.0',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0.1) Gecko/20100101 Firefox/8.0.1',
    'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.151 Safari/535.19'
]


index = 0
while len(urls) > index:
    #pick a new random user agent and sleep time for each request
    randomint = random.randint(0, len(user_agents) - 1)
    randomtime = random.randint(1, 30)

    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', user_agents[randomint])]
    response = opener.open(urls[index])
    the_page = response.read()
    soup = BeautifulSoup(the_page)

    #Search criteria (is an html tag). Example <p>, <body>, <h1>, etc.
    text = soup.findAll("body")

    #Print the text of every matching tag
    for result in text:
        print result.text
    print "--End--"

    index = index + 1
    time.sleep(randomtime)

What I've done here is take a list of common user agents and randomly select one to be passed with each HTTP request, which makes the requests look as if they are coming from different browsers. On top of that, I've added a random wait period (1-30 seconds) after each request.
