Here’s a page out of my thesis. Let me know if you spot any errors.
The growth in the total number of blogs is not the relevant metric for estimating the size of the global conversation known as the blogosphere, because much of the growth could be coming from inactive accounts as users try out the technology and abandon it. In essence, it is akin to people joining a conversation without saying anything. Marlow (2003) has blogged some of his work with the MIT Media Laboratory and has commented on the impact of blog churn on estimates of the “core” blogosphere. Under Sifry’s categorization, blogs that have been updated in the past week are considered to belong to “core” users of the technology, whereas those that have been updated within the past three months are considered to belong to “active” users. Those that have not been updated within three months are arbitrarily defined as “inactive”.
Marlow (2003) represents this concept graphically, but bases it on the possibly inadequate assumption that the proportions of inactive, active and core bloggers remain constant over time. The assumption is potentially untenable because bloggers can make transitions from being active to inactive or even core bloggers, depending on their situation. For example, a core blogger may take a break from blogging in order to travel to some part of the world for an extended period. After one week of not posting, this blogger would be classified as active, and after three months of absence would become inactive. Marlow summarizes it best when he states:
“Above is a chart of this phenomenon over time. The active user population is assume to be half of the total, while the core users are those that have been active for more than 6 months. Note that the core user population does grow exponentially, albeit much more slowly than the total population.
This is just a simple example to show that much better statistics are needed to calculate the true active population of the weblog community. Most of all, we need more information about the users of centrally maintained weblog services. What is the distribution of use over the active period? My assumption, based on my previous research with prior communities, is that most of these users are experimenting with the system, and not among the core group.”
If Marlow’s assumption of equal proportions through time were to hold, the growth constants for total, active and core bloggers would have to be equal through time. However, these constants vary at any point in time as bloggers transition through the Sifry classifications. Therefore, quarterly reports of the growth constants would allow researchers to understand this particular dynamic of Sifry’s quarterly reports on blogosphere growth. The total number of blogs in existence tells us nothing about blogosphere activity if bloggers are signing up but not saying anything to each other. The result is an unnecessary amount of congestion resulting from users trying out the technology and then moving on.
Using the equation for exponential growth and decay, it is not difficult to show that in order for total blogosphere growth to proxy active blogosphere growth, the growth constants must be equal.
i. Given the equations for exponential growth and decay:
ln(N/No) = ki*t OR N/No = e^(ki*t)
Where: No and N are the observed values (To and T, Ao and A, Io and I, Co and C) at the beginning and end of the observation period of length t, respectively
Where: kT, kA, kI and kC are the growth constants for Total, Active, Inactive and Core Accounts.
ii. The accounts are defined as:
T = Total accounts after time t
To = Total accounts at start of time period
A = Active accounts after time t
Ao = Active accounts at start of time period
C = Core accounts after time t
Co = Core accounts at start of time period
I = Inactive accounts after time t
Io = Inactive accounts at start of time period
iii. And, given the following identities:
T = I + A + C AND To = Io + Ao + Co
iv. And, given the following scalar definitions:
Φo, Φ are the proportions of Active accounts relative to Total accounts at the beginning and end of the observed period, respectively.
βo, β are the proportions of Core accounts relative to Total accounts at the beginning and end of the observed period, respectively.
1-Φ-β is the scalar proportion of Inactive accounts
0 < Φo < 1, 0 < βo < 1 and 0 < Φo + βo < 1
0 < Φ < 1, 0 < β < 1 and 0 < Φ + β < 1
Assumption: Φo = Φ, βo = β (i.e. they are constant through time).
v. Such that the following equations hold:
C = βT, Co = βTo, A = ΦT, Ao = ΦTo, I = (1-Φ-β)T, Io = (1-Φ-β)To
vi. Solution:
The exponential growth equation implies:
T/To = e^(kT*t), A/Ao = e^(kA*t), I/Io = e^(kI*t), C/Co = e^(kC*t)
Substitution of the equations in v, where the constant proportions cancel in each ratio (for example, A/Ao = ΦT/ΦTo = T/To), shows that:
T/To = e^(kT*t), T/To = e^(kA*t), T/To = e^(kI*t), T/To = e^(kC*t)
Which implies: kT = kA = kI = kC
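The result can be checked numerically. The sketch below is only an illustration: the starting and ending totals, the period length and the proportions are hypothetical values chosen for the example, not Sifry's data, and the function name growth_constant is an assumption of this sketch. It solves N/No = e^(k*t) for k and confirms that, when the proportions are held constant, all four growth constants coincide.

```python
import math


def growth_constant(n0, n, t):
    """Solve N/No = e^(k*t) for k, i.e. k = ln(N/No) / t."""
    return math.log(n / n0) / t


# Hypothetical illustration values (NOT taken from Sifry's reports).
t = 0.5               # length of the observation period, in years
T0, T = 10.0, 20.0    # total accounts at the start and end, in millions
phi, beta = 0.3, 0.1  # assumed constant Active and Core proportions


def split(total):
    """Split a total into (active, inactive, core) using constant proportions."""
    active = phi * total
    core = beta * total
    inactive = (1 - phi - beta) * total
    return active, inactive, core


A0, I0, C0 = split(T0)
A, I, C = split(T)

constants = {
    "kT": growth_constant(T0, T, t),
    "kA": growth_constant(A0, A, t),
    "kI": growth_constant(I0, I, t),
    "kC": growth_constant(C0, C, t),
}

for name, k in constants.items():
    print(f"{name} = {k:.4f}")
```

Changing phi or beta between the start and the end of the period (that is, calling split with different proportions for T0 and T) makes the constants diverge, which is the situation the observed data describe.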
Available data, however, show that Φ, β and (1-β-Φ) are not constant through time, so the assumption of constant proportions is not satisfied. Active, core and inactive accounts, as a proportion of total, vary throughout time because bloggers transition through these categories depending on their situation.
Sifry’s data for the period of July 2005 to January 2006 show that the scalar parameters are changing through time. While the blogosphere as defined by Technorati was doubling every 0.53 years, the number of inactive bloggers grew slightly faster over this period, doubling every 0.46 years. Active and core accounts grew more slowly, doubling every 0.57 and 0.92 years, respectively.
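Each doubling time corresponds to a growth constant through k = ln(2) / t_double, which follows from setting N/No = 2 in the exponential growth equation. The short sketch below applies that conversion to the four doubling times quoted above; the dictionary names are illustrative choices only.

```python
import math

# Doubling times in years, as quoted in the text for July 2005 to January 2006.
doubling_times = {
    "total": 0.53,
    "inactive": 0.46,
    "active": 0.57,
    "core": 0.92,
}

# Setting N/No = 2 in N/No = e^(k*t) gives k = ln(2) / t_double.
growth_constants = {
    name: math.log(2) / t_double for name, t_double in doubling_times.items()
}

for name, k in growth_constants.items():
    print(f"{name}: k = {k:.2f} per year")
```

Because the four constants differ, the equality kT = kA = kI = kC required by the constant-proportion assumption does not hold for this period.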

The table above is based on two passages of Sifry’s reports for July 2005 and January 2006, and it adjusts the reported data to reflect the fact that the category of active bloggers, or those who have updated in the past three months, includes the category of core bloggers, or those who have updated in the past week. For the period of July 2005 to January 2006, Technorati saw 13 million blogs added to its service. As these 13 million blogs were added, 7.1 million bloggers lapsed into inactivity while 5.9 million joined the ranks of active and core bloggers. The trends are diverging, and should they continue, the active and core blogosphere will shrink in proportion to the total blogosphere. The proportions of inactive, active and core bloggers differ at the beginning and end of the observation period and, notably, so do the growth constants. Therefore, the assumption that Φo = Φ, βo = β does not hold, and so total accounts cannot act as a proxy for active accounts.
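As a quick arithmetic check on the figures in this paragraph, the sketch below splits the 13 million added blogs into the inactive share and the active-plus-core share. Only the three figures quoted in the text are used; the variable names are illustrative.

```python
import math

# Figures quoted in the text, in millions of blogs (July 2005 to January 2006).
added_total = 13.0
added_inactive = 7.1
added_active_and_core = 5.9

# The two components should account for all of the added blogs.
components_sum = added_inactive + added_active_and_core

# Share of the period's growth attributable to each group.
inactive_share = added_inactive / added_total
active_core_share = added_active_and_core / added_total

print(f"inactive share of growth: {inactive_share:.1%}")
print(f"active and core share of growth: {active_core_share:.1%}")
```

More than half of the added blogs fall into the inactive category, which is consistent with the inactive group having the shortest doubling time.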