DIY: Twitter Share Counts (Part 2)

In part one, I showed you how to talk to the Twitter API, get the tweets, and count them. In this post, I’m going to show you how to handle those popular posts with 100+ tweet shares. We’re going to page through the results with recursion and store the counts as we go. It’s not as simple as it sounds, so brace yourself! We’re going to jump right into the code!

What is recursion?

My interpretation of recursion is a function that calls itself in order to loop, incrementing or processing data as it goes. If, however, you want to REALLY read into recursion, there is a great Wikipedia article on it. Recursion is used in MANY languages and has many patterns, some of which you may even find in WordPress core.

For this article we’ll be working with direct recursion, which is when a function calls itself in order to complete its loop. So let’s jump into it, shall we?
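
If you want to see direct recursion on its own before we touch Twitter, here’s a minimal sketch. The $pages array and count_items() are made up for illustration; they aren’t part of the class from part 1.

```php
<?php
// Hypothetical data: each "page" holds an item count and the key of the next page (or null).
$pages = array(
    'a' => array( 'items' => 3, 'next' => 'b' ),
    'b' => array( 'items' => 2, 'next' => 'c' ),
    'c' => array( 'items' => 4, 'next' => null ),
);

/**
 * Count items from page $key onward. The function calls itself for the
 * next page, which is what makes this direct recursion.
 */
function count_items( array $pages, $key ) {
    if ( null === $key || ! isset( $pages[ $key ] ) ) {
        return 0; // Base case: no more pages, so stop recursing.
    }

    $page = $pages[ $key ];

    return $page['items'] + count_items( $pages, $page['next'] );
}

echo count_items( $pages, 'a' ); // 9
```

The important part is the base case: without it, the function would keep calling itself forever.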

The Code!!!

Note: If you haven’t read part 1 of this series, this will make little sense to you, so go read it first.

We first need to set up a stack (or variable, as you know it) to hold our counts. Since we’re using a class, it’s as simple as adding a property that any method can access. We can just call it $count and start it at 0, nothing special. We now need to update the Twitter_Counts::tweet_count() method to accept arguments, because we’re going to be using some results that we get back from Twitter’s response. While we’re in here, why don’t we clean up the code a bit and condense those two separate ‘if’ statements?

Your code should look something like this:

    public function tweet_count( $post_id, $args = array() ) {

        $defaults = array(
            'q'                => get_permalink( $post_id ),
            'count'            => 100,
            'include_entities' => true,
        );

        $args = wp_parse_args( $args, $defaults );

        $oauth = new TwitterOAuth( $this->consumer_key, $this->consumer_secret, $this->access_token, $this->access_secret );

        $statuses = $oauth->get( 'search/tweets', $args );

        if ( ! $statuses || ! isset( $statuses->statuses ) ) {
            return false;
        }

        return count( $statuses->statuses );
    }

In the above code, you’ll notice we added $args to the function signature and defaulted it to array(). This keeps existing page templates that already call our code from breaking, and it means we don’t have to pass the entire array every time. Because we only want to pass the keys that change, I run $args through wp_parse_args(), which fills in the rest from $defaults.
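
To see what that merge does without loading WordPress, here’s a rough stand-in. When both inputs are arrays, wp_parse_args() behaves like array_merge() with the defaults first; merge_args() and the example.com permalink are assumptions I’m using purely for illustration.

```php
<?php
// Stand-in for wp_parse_args() when both inputs are arrays (assumption: run outside WordPress).
function merge_args( array $args, array $defaults ) {
    return array_merge( $defaults, $args ); // Keys in $args win over $defaults.
}

$defaults = array(
    'q'                => 'https://example.com/my-post/', // Placeholder for get_permalink().
    'count'            => 100,
    'include_entities' => true,
);

// Pass only the key that changes; everything else comes from $defaults.
$merged = merge_args( array( 'max_id' => '249279667666817023' ), $defaults );

print_r( $merged );
```

This is why the recursive call later on only needs to hand over max_id.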

Looping over pages

Recursion, good sir or madam, recursion! If you don’t already have it open, check the official documentation on the search/tweets endpoint. Look at the search_metadata; it should look something like this:

  "search_metadata": {
    "max_id": 250126199840518145,
    "since_id": 24012619984051000,
    "refresh_url": "?since_id=250126199840518145&q=%23freebandnames&result_type=mixed&include_entities=1",
    "next_results": "?max_id=249279667666817023&q=%23freebandnames&count=4&include_entities=1&result_type=mixed",
    "count": 4,
    "completed_in": 0.035,
    "since_id_str": "24012619984051000",
    "query": "%23freebandnames",
    "max_id_str": "250126199840518145"
  }

From playing around with the API, I noticed that the next_results key only shows up when there is a next page. We’re already passing count and include_entities, and result_type defaults to mixed, so we don’t need all that fancy stuff for our recursion.

What we absolutely DO need is the max_id value. This value tells us where we left off in our loop. How do we parse this string? First, we need to cut off the question mark at the beginning. To do this, PHP has a nifty function called ltrim(), which strips whitespace, or any characters we specify, from the beginning of a string. We then pass the resulting string through parse_str().

What parse_str() is going to do is pull the key/value pairs out of the next_results query string so we can read them in PHP. Now we need to make sure we got back a max_id value, and if so, we move on to the next part. The code should look something like this:

    public function tweet_count( $post_id, $args = array() ) {

        $defaults = array(
            'q'                => get_permalink( $post_id ),
            'count'            => 100,
            'include_entities' => true,
        );

        $args = wp_parse_args( $args, $defaults );

        $oauth = new TwitterOAuth( $this->consumer_key, $this->consumer_secret, $this->access_token, $this->access_secret );

        $statuses = $oauth->get( 'search/tweets', $args );

        if ( ! $statuses || ! isset( $statuses->statuses ) ) {
            return false;
        }

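        if ( isset( $statuses->search_metadata ) && isset( $statuses->search_metadata->next_results ) ) {
            $next_results = ltrim( $statuses->search_metadata->next_results, '?' );
            // Parse into an array: the one-argument form of parse_str() is
            // deprecated since PHP 7.2 and no longer works in PHP 8.
            parse_str( $next_results, $query );
            if ( isset( $query['max_id'] ) ) {
                // We have a next page
            }
        }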

        return count( $statuses->statuses );
    }
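
To see the parsing step in isolation, here’s a small sketch that runs on the sample next_results string from the metadata above. It passes a second argument to parse_str() so the values land in an array; $query is just a name I picked.

```php
<?php
// Sample value copied from the search_metadata example.
$next_results = '?max_id=249279667666817023&q=%23freebandnames&count=4&include_entities=1&result_type=mixed';

// Drop the leading "?" and collect the key/value pairs in $query.
$query = array();
parse_str( ltrim( $next_results, '?' ), $query );

// parse_str() hands every value back as a string, so the long ID stays intact.
$max_id = isset( $query['max_id'] ) ? $query['max_id'] : null;

echo $max_id . "\n";     // 249279667666817023
echo $query['q'] . "\n"; // #freebandnames
```

Note that parse_str() also URL-decodes the values, which is why %23 comes back as #.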

To make the method recursive, we’re essentially just going to have it call itself. But what about the counts? If the method simply called itself, each call would return only its own page’s count, and we’d lose the running total! To fix this, we need to handle the counts a little differently. Remember that count property from the first section? This is where we’ll use it.

We first need to update our if check for the statuses to return the count property instead of a boolean value. Once that’s done, we also need to increment the count after this if check, but before our metadata check. This should be more evident shortly.

Once we do that, we need to update the arguments we’re passing back into the method. This is why we used wp_parse_args(): we only have to pass in the max_id parameter to tell Twitter where to begin. Your code should look like this:

    public function tweet_count( $post_id, $args = array() ) {

        $defaults = array(
            'q'                => get_permalink( $post_id ),
            'count'            => 100,
            'include_entities' => true,
        );

        $args = wp_parse_args( $args, $defaults );

        $oauth = new TwitterOAuth( $this->consumer_key, $this->consumer_secret, $this->access_token, $this->access_secret );

        $statuses = $oauth->get( 'search/tweets', $args );

        if ( ! $statuses || ! isset( $statuses->statuses ) ) {
            return $this->count;
        }

        $this->count = $this->count + count( $statuses->statuses );

        if ( isset( $statuses->search_metadata ) && isset( $statuses->search_metadata->next_results ) ) {
            $next_results = ltrim( $statuses->search_metadata->next_results, '?' );
            // Parse into an array: the one-argument form of parse_str() is
            // deprecated since PHP 7.2 and no longer works in PHP 8.
            parse_str( $next_results, $query );
            if ( isset( $query['max_id'] ) ) {
                // We have a next page
                $this->tweet_count( $post_id, array( 'max_id' => $query['max_id'] ) );
            }
        }

        return $this->count;
    }
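
If you’d like to watch the count build up without burning API calls, here’s a stripped-down simulation. Fake_Tweet_Counter, its fetch() method, and the page data are all invented for this sketch; fetch() simply stands in for $oauth->get().

```php
<?php
class Fake_Tweet_Counter {

    public $count = 0; // Running total shared by every recursive call.

    private $pages;

    public function __construct( array $pages ) {
        $this->pages = $pages;
    }

    // Stand-in for $oauth->get(): returns the fake page for a max_id, or null.
    private function fetch( $max_id ) {
        $key = null === $max_id ? 'first' : $max_id;

        return isset( $this->pages[ $key ] ) ? $this->pages[ $key ] : null;
    }

    public function tweet_count( $max_id = null ) {
        $page = $this->fetch( $max_id );

        if ( null === $page ) {
            return $this->count; // Nothing came back, so return what we have.
        }

        $this->count += $page['tweets'];

        if ( isset( $page['next_max_id'] ) ) {
            // We have a next page, so the method calls itself.
            $this->tweet_count( $page['next_max_id'] );
        }

        return $this->count;
    }
}

$counter = new Fake_Tweet_Counter( array(
    'first' => array( 'tweets' => 100, 'next_max_id' => '300' ),
    '300'   => array( 'tweets' => 100, 'next_max_id' => '200' ),
    '200'   => array( 'tweets' => 37 ),
) );

echo $counter->tweet_count(); // 237
```

Because the total lives on the object instead of in a return value, every level of the recursion adds to the same number.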

Working with Rate Limits

Over time you’re absolutely going to hit the rate limit. In fact, a single well-shared post can hit the limit almost immediately. So how do we handle this? Enter post_meta!

We know from looking at the API that we need to store, at minimum, the max_id value we get back. Storing the count is quite obvious, yeah? Let’s do that!

When we process the tweet query, we already have the last max_id in the args, and we know that if we’ve hit a rate limit, the result set will no longer have a statuses property. If you were to print_r() the response when it fails, you’d see this:

[10-Dec-2015 18:54:22 UTC] stdClass Object
(
    [errors] => Array
        (
            [0] => stdClass Object
                (
                    [message] => Rate limit exceeded
                    [code] => 88
                )

        )

)

The message speaks for itself: we’ve hit the rate limit! So, let’s store some data for the future. As I said, we’re going to store the max_id parameter and the count. In the interest of making this class as modular as possible, let’s make a helper method for saving the data.

To begin, we create a new save_count() method, which takes two parameters: the post ID and an $args array, because we want to save both the count and the new max_id. We then need to update our returns to use this new method.

Most would say, “Why create a whole method for that?” Well, for me, it’s purely because it’s used in two places in a single method. Instead of duplicating the code, this is a lot cleaner and lessens the chance of forgetting to change it in both locations.
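
The class in this article only checks whether statuses is missing, but if you ever want to tell a rate limit apart from other failures, a check along these lines could work. is_rate_limited() is a hypothetical helper, and the JSON strings are hand-written to match the shape of the dump above.

```php
<?php
// Hypothetical helper: true when a decoded response carries error code 88.
function is_rate_limited( $response ) {
    if ( ! is_object( $response ) || ! isset( $response->errors ) || ! is_array( $response->errors ) ) {
        return false;
    }

    foreach ( $response->errors as $error ) {
        if ( isset( $error->code ) && 88 === (int) $error->code ) {
            return true;
        }
    }

    return false;
}

// Hand-written samples shaped like the rate-limit dump and a normal, empty result.
$limited = json_decode( '{"errors":[{"message":"Rate limit exceeded","code":88}]}' );
$ok      = json_decode( '{"statuses":[]}' );

var_dump( is_rate_limited( $limited ) ); // bool(true)
var_dump( is_rate_limited( $ok ) );      // bool(false)
```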

So far you should have something like this:

    public function save_count( $post_id, $args ) {
        $save = array(
            'count' => $this->count,
        );
        if ( isset( $args['max_id'] ) ) {
            $save['max_id'] = $args['max_id'];
        }

        update_post_meta( $post_id, 'tweet_data', $save );

        return $this->count;
    }

    public function tweet_count( $post_id, $args = array() ) {

        $defaults = array(
            'q'                => get_permalink( $post_id ),
            'count'            => 100,
            'include_entities' => true,
        );

        $args = wp_parse_args( $args, $defaults );

        $oauth = new TwitterOAuth( $this->consumer_key, $this->consumer_secret, $this->access_token, $this->access_secret );

        $statuses = $oauth->get( 'search/tweets', $args );

        if ( ! $statuses || ! isset( $statuses->statuses ) ) {
            return $this->save_count( $post_id, $args );
        }

        $this->count = $this->count + count( $statuses->statuses );

        if ( isset( $statuses->search_metadata ) && isset( $statuses->search_metadata->next_results ) ) {
            $next_results = ltrim( $statuses->search_metadata->next_results, '?' );
            // Parse into an array: the one-argument form of parse_str() is
            // deprecated since PHP 7.2 and no longer works in PHP 8.
            parse_str( $next_results, $query );
            if ( isset( $query['max_id'] ) ) {
                // We have a next page
                $this->tweet_count( $post_id, array( 'max_id' => $query['max_id'] ) );
            }
        }

        return $this->save_count( $post_id, $args );
    }

It’s a bit shoddy, but it works as it is. We now need the method to pick up where it left off. To do that, we first check post meta: if a count is stored there, we load it into the count property, and if a max_id is stored there, we add it to our default arguments. So let’s do that; it’s a pretty simple change. It should look something similar to this:

    public function tweet_count( $post_id, $args = array() ) {

        $defaults = array(
            'q'                => get_permalink( $post_id ),
            'count'            => 100,
            'include_entities' => true,
        );

        $meta = get_post_meta( $post_id, 'tweet_data' );
        $meta = is_array( $meta ) ? array_shift( $meta ) : array();

        $this->count = isset( $meta['count'] ) && 0 == $this->count ? $meta['count'] : $this->count;

        if ( isset( $meta['max_id'] ) ) {
            $defaults['max_id'] = $meta['max_id'];
        }

        $args = wp_parse_args( $args, $defaults );

        $oauth = new TwitterOAuth( $this->consumer_key, $this->consumer_secret, $this->access_token, $this->access_secret );

        $statuses = $oauth->get( 'search/tweets', $args );

        if ( ! $statuses || ! isset( $statuses->statuses ) ) {
            return $this->save_count( $post_id, $args );
        }

        $this->count = $this->count + count( $statuses->statuses );

        if ( isset( $statuses->search_metadata ) && isset( $statuses->search_metadata->next_results ) ) {
            $next_results = ltrim( $statuses->search_metadata->next_results, '?' );
            // Parse into an array: the one-argument form of parse_str() is
            // deprecated since PHP 7.2 and no longer works in PHP 8.
            parse_str( $next_results, $query );
            if ( isset( $query['max_id'] ) ) {
                // We have a next page
                $this->tweet_count( $post_id, array( 'max_id' => $query['max_id'] ) );
            }
        }

        return $this->save_count( $post_id, $args );
    }

The reason I put it in defaults is that I want to make 100% sure we can override the post meta. Because it’s set before wp_parse_args() runs, any max_id passed through $args takes precedence over the stored one, which in turn lets us skip tweets if we want to. This is entirely optional, of course, just as long as the parameter is set before you run $oauth->get().

What if the search only returns, say, fifty tweets? Well, there’s no next page, so the script would never set the max_id and would start from the beginning again next time. Since we’re storing counts in post_meta, that’s a bad idea: we’d essentially inflate the counts over time.

We need to update the code to always have a max_id stored, right? Well, here’s a little caveat: max_id is technically a 64-bit integer. Twitter returns tweets in numerically descending order of ID, newest first, meaning the first tweet in history could be 00000000000001 (serialization, if you will).

So how do we handle the max_id? Subtract 1 from the earliest ID? Yeah, if your system supports it. If not, the ID overflows PHP’s integer type, gets converted to a float, and loses precision, so the math comes out wrong.

To determine if you can simply subtract 1 from the max_id, you can run php -r 'echo PHP_INT_MAX;', which will tell you the largest integer PHP can deal with when doing calculations. On 32-bit builds it will be 2147483647, and tweet IDs are longer than that. On 64-bit builds it will be 9223372036854775807, which is plenty.

If you can’t, or you’re like me and don’t have 64-bit PHP installed, you’re still able to handle this, so let’s do it the 32-bit way.

In the code, we know that if we get the statuses back, each tweet has an ID, so we simply need to get the last tweet’s ID and set max_id if it’s not already set. Since tweet statuses are stored in a numerically indexed array, we only need to use end() to get the last one and grab the tweet ID. The code should look something like this:
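
You can make the same check from inside a script. This is only a sketch: $tweet_id is the sample ID from the metadata example, and $can_do_integer_math is a name I made up.

```php
<?php
// Sample ID from the metadata example, kept as a string so no precision is lost.
$tweet_id = '250126199840518145';

// PHP_INT_SIZE is 4 on 32-bit builds and 8 on 64-bit builds.
$can_do_integer_math = PHP_INT_SIZE >= 8;

if ( $can_do_integer_math ) {
    // Safe here: the ID fits in a native integer.
    echo 'One less than the ID: ' . ( (int) $tweet_id - 1 ) . "\n";
} else {
    echo 'Tweet IDs do not fit in PHP_INT_MAX (' . PHP_INT_MAX . '), so keep them as strings.' . "\n";
}
```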

    public function tweet_count( $post_id, $args = array() ) {

        $defaults = array(
            'q'                => get_permalink( $post_id ),
            'count'            => 100,
            'include_entities' => true,
        );

        $meta = get_post_meta( $post_id, 'tweet_data' );
        $meta = is_array( $meta ) ? array_shift( $meta ) : array();

        $this->count = isset( $meta['count'] ) && 0 == $this->count ? $meta['count'] : $this->count;

        if ( isset( $meta['max_id'] ) ) {
            $defaults['max_id'] = $meta['max_id'];
        }

        $args = wp_parse_args( $args, $defaults );

        $oauth = new TwitterOAuth( $this->consumer_key, $this->consumer_secret, $this->access_token, $this->access_secret );

        $statuses = $oauth->get( 'search/tweets', $args );

        if ( ! $statuses || ! isset( $statuses->statuses ) ) {
            return $this->save_count( $post_id, $args );
        }

        $this->count = $this->count + count( $statuses->statuses );

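        if ( ! isset( $args['max_id'] ) ) {
            $last_status = end( $statuses->statuses );
            // Use id_str, the string form of the ID: on 32-bit PHP the
            // numeric id does not fit in an integer.
            if ( isset( $last_status->id_str ) ) {
                $args['max_id'] = $last_status->id_str;
            }
        }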

        if ( isset( $statuses->search_metadata ) && isset( $statuses->search_metadata->next_results ) ) {
            $next_results = ltrim( $statuses->search_metadata->next_results, '?' );
            // Parse into an array: the one-argument form of parse_str() is
            // deprecated since PHP 7.2 and no longer works in PHP 8.
            parse_str( $next_results, $query );
            if ( isset( $query['max_id'] ) ) {
                // We have a next page
                $this->tweet_count( $post_id, array( 'max_id' => $query['max_id'] ) );
            }
        }

        return $this->save_count( $post_id, $args );
    }
}
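
Here’s the end() trick on its own, using a made-up, trimmed-down response (real tweets carry many more fields). It reads id_str, the string form of the ID that Twitter sends alongside id, so nothing depends on how big an integer your PHP build can hold.

```php
<?php
// Made-up response with three tweets, newest first.
$response = json_decode( '{"statuses":[{"id_str":"30"},{"id_str":"20"},{"id_str":"10"}]}' );

// end() moves the array pointer to the final element and returns it,
// or returns false when the array is empty.
$last_status = end( $response->statuses );

$last_id = ( false !== $last_status && isset( $last_status->id_str ) ) ? $last_status->id_str : false;

echo $last_id; // 10
```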

A note about using max_id

After reading through Twitter’s documentation (which I didn’t do until the end of this article, so no, I’m not perfect, don’t judge me), I learned that max_id is inclusive: the tweets returned include the last tweet from the previous query. Twitter’s documentation on paging through timelines covers this in more detail.

What does this mean? Is the code broken? No, why would you ask that!!! You just get a few extra counts, that’s not too bad, right? To illustrate, look at this gist: https://gist.github.com/a56831e5cf1775f1351d

See, the max_id is one more than the last tweet ID; this is intentional on Twitter’s part. How do we handle it? Just subtract 1 from it? Well, no. Er, you can if you’re running 64-bit PHP, but I’m not, so let’s update the code to handle it in a 32-bit situation.
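
As an aside, if you really do want to subtract 1 on 32-bit PHP, one option is to do the math on the digits of the string. This is not the route the class takes; decrement_id() is a hypothetical helper and assumes a positive ID made up of decimal digits only.

```php
<?php
// Hypothetical helper: subtract 1 from a positive decimal ID held as a string.
function decrement_id( $id ) {
    $digits = str_split( $id );

    // Walk from the rightmost digit, borrowing across any zeros.
    for ( $i = count( $digits ) - 1; $i >= 0; $i-- ) {
        if ( '0' === $digits[ $i ] ) {
            $digits[ $i ] = '9'; // Borrow from the next digit to the left.
            continue;
        }

        $digits[ $i ] = (string) ( (int) $digits[ $i ] - 1 );
        break;
    }

    // Drop any leading zero left behind by the borrow.
    $result = ltrim( implode( '', $digits ), '0' );

    return '' === $result ? '0' : $result;
}

echo decrement_id( '250126199840518145' ) . "\n"; // 250126199840518144
echo decrement_id( '1000' ) . "\n";               // 999
```

For the rest of this article, though, we’ll stick with grabbing the last tweet’s ID.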

First we have to look at the search_metadata section checks we added. We definitely want to set max_id here, but we don’t want to use the max_id in the next_results key. Since we want to use the last tweet instead, let’s clean up the code a bit:

    public function tweet_count( $post_id, $args = array() ) {

        $defaults = array(
            'q'                => get_permalink( $post_id ),
            'count'            => 100,
            'include_entities' => true,
        );

        $meta = get_post_meta( $post_id, 'tweet_data' );
        $meta = is_array( $meta ) ? array_shift( $meta ) : array();

        $this->count = isset( $meta['count'] ) && 0 == $this->count ? $meta['count'] : $this->count;

        if ( isset( $meta['max_id'] ) ) {
            $defaults['max_id'] = $meta['max_id'];
        }

        $args = wp_parse_args( $args, $defaults );

        $oauth = new TwitterOAuth( $this->consumer_key, $this->consumer_secret, $this->access_token, $this->access_secret );

        $statuses = $oauth->get( 'search/tweets', $args );

        if ( ! $statuses || ! isset( $statuses->statuses ) ) {
            return $this->save_count( $post_id, $args );
        }

        $this->count = $this->count + count( $statuses->statuses );
        $last_status = end( $statuses->statuses );
        // Use id_str, the string form of the ID, so 32-bit PHP keeps every digit.
        $last_id = isset( $last_status->id_str ) ? $last_status->id_str : false;

        if ( ! isset( $args['max_id'] ) ) {
            if ( $last_id ) {
                $args['max_id'] = $last_id;
            }
        }

        if ( isset( $statuses->search_metadata ) && isset( $statuses->search_metadata->next_results ) ) {
            // $last_id is always set (possibly to false), so test its value rather than isset().
            if ( $last_id ) {
                // We have a next page
                $this->tweet_count( $post_id, array( 'max_id' => $last_id ) );
            }
        }

        return $this->save_count( $post_id, $args );
    }

Conclusion

I hope this article (and its predecessor) has helped you understand both the Twitter API and how absolutely easy it is to work with, thanks to Abraham’s OAuth library. Not to mention a little bit of recursion!

I realize there are things in the code which could have been changed, condensed, or otherwise improved, but I really didn’t want to push the scope of this DIY beyond what it already is. You may also need to alter the loop/recursion a bit if your server is low on resources.

With all things considered, I hope these articles helped you and I look forward to reading your responses.

Full Source – https://gist.github.com/JayWood/3f7d3bf6ee4bd57ba88c
