AOL’s research department released a dataset containing the search history of 500 thousand users with 20 million search terms. AOL described the purpose this way: “The goal of this collection is to provide real query log data that is based on real users. It could be used for personalization, query reformulation or other types of search research.” AOL soon removed the data from their site, but the damage had already been done, and mirrors are all over the net.
This data is very valuable for marketers, SEOs, and spammers. The problem with the data is that it identifies users with a unique ID, so all searches from a particular user are tied to that ID. With enough searches, it can be possible to determine who the person is. Since AOL uses Google as its search engine, this is essentially the same data that Google fought to keep from the government. Now it is all over the net, and people are finding all kinds of interesting info.
It is only a matter of time until someone releases a web interface to search and parse this data. I am sure Google link spammers are already parsing this data to find the best keywords to spam. I would imagine that Google will have an interesting response soon. And this is definitely going to hurt AOL. I am glad that I am not using AOL for anything other than AIM, but it would not surprise me if I found my chats online soon.
Of course I have already downloaded the data, and though I don’t have much time since I am moving in two weeks, I will probably import the data into MySQL and run a few queries. >:)
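For anyone curious what that kind of import and query might look like, here is a minimal sketch in Python. It makes several assumptions: it uses SQLite instead of MySQL so that it runs without a database server, it assumes a tab-separated file with a header row, and the column names (`user_id`, `query`, `query_time`) and the sample rows are made up for illustration rather than taken from the real files.

```python
import csv
import io
import sqlite3

# Made-up sample rows in an ASSUMED tab-separated layout with a header line.
SAMPLE_LOG = (
    "user_id\tquery\tquery_time\n"
    "1001\tcheap flights\t2006-03-01 10:15:00\n"
    "1001\thotels in denver\t2006-03-01 10:20:00\n"
    "1002\tcheap flights\t2006-03-02 08:00:00\n"
    "1002\tweather\t2006-03-02 08:05:00\n"
)


def load_log(conn, lines):
    """Create the table if needed and insert every row; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS searches "
        "(user_id INTEGER, query TEXT, query_time TEXT)"
    )
    # QUOTE_NONE: search terms may contain quote characters, so do not
    # treat them as CSV quoting.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [(int(r["user_id"]), r["query"], r["query_time"]) for r in reader]
    conn.executemany("INSERT INTO searches VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def queries_for_user(conn, user_id):
    """Everything one ID searched for, in time order."""
    cur = conn.execute(
        "SELECT query FROM searches WHERE user_id = ? ORDER BY query_time",
        (user_id,),
    )
    return [row[0] for row in cur]


def top_queries(conn, limit=10):
    """Most frequent search terms across all users."""
    cur = conn.execute(
        "SELECT query, COUNT(*) AS n FROM searches "
        "GROUP BY query ORDER BY n DESC, query LIMIT ?",
        (limit,),
    )
    return cur.fetchall()
```

The `queries_for_user` query is the worrying one: pulling every search behind a single ID is exactly what makes it possible to guess who that person is.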
I have finally created a JavaScript version of my C# password generator. The JavaScript version, located here, will generate multiple random passwords containing upper and lower case letters, numbers, and symbols. Just hit the generate button to get your passwords. Remember your passwords and keep them in a safe place.
I have several ideas to extend the functionality of this tool, and I will work on them when I get time over the next few months. If you have any suggestions please let me know.
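To give an idea of how a generator like this can work, here is a rough Python sketch. It is not the JavaScript or C# code behind the tool, only an illustration of the same idea. The symbol set, the default length, and the function names are my own assumptions for the example.

```python
import secrets
import string

# Assumed symbol set for illustration; adjust to whatever a site accepts.
SYMBOLS = "!@#$%^&*()-_=+"
POOLS = [string.ascii_uppercase, string.ascii_lowercase, string.digits, SYMBOLS]


def generate_password(length=12):
    """Return one random password with at least one character from each pool."""
    if length < len(POOLS):
        raise ValueError("length must be at least %d" % len(POOLS))
    # Guarantee one upper, one lower, one digit, and one symbol.
    chars = [secrets.choice(pool) for pool in POOLS]
    # Fill the rest from the combined alphabet.
    alphabet = "".join(POOLS)
    chars += [secrets.choice(alphabet) for _ in range(length - len(POOLS))]
    # Shuffle so the guaranteed characters are not always at the front.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


def generate_passwords(count=5, length=12):
    """Return a list of independent random passwords."""
    return [generate_password(length) for _ in range(count)]
```

The sketch uses the `secrets` module rather than `random`, since `secrets` draws from the operating system's cryptographic random source, which is the better fit for passwords.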
I recently came across SharpMail, a UK-based company that offers a fake email service similar to the service I host here. They offer a lot of cool features like reply back, file attachments, SMS text messages (doesn’t work in the US), rich text messages, and premade prank emails. However, they have several features that I don’t like and that make my service better. First, you have to register an account with them to do anything. Second, they put a very noticeable link in the email, so the recipient knows very quickly that the email is fake. They also add a huge x-header that alerts the recipient to the fact that it is a prank. For $35 a year, you can remove these. So if you want to send a more truly anonymous (and free) email, try out this. It is my goal to add a few more features to the script, like an optional reply feature, and maybe a new form with a rich text editor. I am also working on a C# program that will do the same stuff.
A honeypot is a computer system that is designed to catch hackers. It is positioned in a network in a spot where it is a good target for hackers. Honeypots can be used to detect malicious activity on a network, or to act as a decoy that draws hackers away from the real systems. Honeypots are also frequently used for research to detect and analyze new worms and attacks. There are two basic categories of honeypots: high-interaction and low-interaction. A high-interaction honeypot is a system that is designed to be completely compromised. A low-interaction honeypot is a system that simulates different parts of a network system. In this article we are going to build a low-interaction honeypot with the Windows program HoneyBot.
HoneyBot, which can be downloaded here, is a Windows program that opens over 1200 TCP and UDP ports and simulates common services on them. It then captures all packet traffic to these ports and logs the packets and the source IP address. It is able to simulate some basic services by replying on certain ports. It is also able to capture worms and trojans by saving them to a folder. It is an easy-to-use program that is a good choice for getting your feet wet with honeypots.
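The basic low-interaction idea (listen on a port, reply with something that looks like a service, and log who connected and what they sent) can be sketched in a few lines of Python. This is only a toy illustration and says nothing about how HoneyBot itself is implemented. The banner text, the function names, and the log record fields are all assumptions made up for the example.

```python
import socket
from datetime import datetime, timezone

# Hypothetical banner that pretends to be a mail server.
FAKE_BANNER = b"220 mail.example.test ESMTP ready\r\n"


def open_listener(host="127.0.0.1", port=0):
    """Open a TCP listener; port 0 lets the OS pick a free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(5)
    return srv


def handle_one(srv, log, timeout=2.0):
    """Accept one connection, send the fake banner, and log what comes back."""
    conn, (ip, src_port) = srv.accept()
    with conn:
        conn.settimeout(timeout)
        conn.sendall(FAKE_BANNER)
        try:
            payload = conn.recv(4096)  # first chunk the visitor sends
        except socket.timeout:
            payload = b""  # visitor connected but sent nothing
    log.append(
        {
            "time": datetime.now(timezone.utc).isoformat(),
            "ip": ip,
            "src_port": src_port,
            "local_port": srv.getsockname()[1],
            "payload": payload,
        }
    )
```

A real honeypot would run many of these listeners at once and write the log to disk. The sketch binds to 127.0.0.1 by default so that trying it out does not expose anything to the network.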
Continue reading ‘Run a Low-Interaction Honeypot with HoneyBot’
Google just released the limited beta version of their new online Spreadsheet application. I signed up as soon as it was available and received my invitation a few hours later. After working on a few spreadsheets with it, I found it to be a nice, easy-to-use spreadsheet application. I think it has a lot of potential. However, I am not planning to switch from Excel to Google Spreadsheets anytime soon. This got me thinking about the pros and cons of online office apps, and I have concluded that most online office apps have a long way to go before they are widely used. So here is a list of some of the cons of online applications, and my thoughts about them.
Continue reading ‘Will Online Office Apps take over the desktop?’
I’ve always known how to do basic IP sniffing, but with all the recent news focus on data mining by the NSA and AT&T, I decided to do a little research and dig into IP sniffing. Obviously the NSA uses some pretty sophisticated software and hardware to handle all the IP data that they collect, but there are plenty of open source tools that will do pretty much the same stuff for a smaller network.
The best program for packet capture and analysis is Ethereal. It captures packets and displays them in a nice GUI. It can also save the packets to a file, and open and process captured packet files. It has the ability to process the packets by applying filters. For example, you could filter out all ARP traffic, or only capture HTTP. Ethereal also allows you to filter by TCP stream. It can display the data portions of the packets in a stream in the order they were sent. In this way, you could reconstruct an HTML page or an SMTP email. However, the purpose of this article is not to be a guide on Ethereal, but to show you how to arrange your network to sniff your internet connection and capture all packets coming and going across your internet pipe.
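To make the filtering and stream ideas concrete, here is a small Python sketch that works on already-captured packets. It does not use Ethereal or any capture library. The `Packet` record, its fields, and the sample packets are all hypothetical, and the reassembly ignores real-world details like retransmissions and sequence number wraparound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified record of one captured packet."""
    proto: str      # e.g. "tcp" or "arp"
    src: str        # sender as "ip:port"
    dst: str        # receiver as "ip:port"
    seq: int        # TCP sequence number (0 for non-TCP)
    payload: bytes  # data portion of the packet


def drop_protocol(packets, proto):
    """Filter out every packet of one protocol, e.g. all ARP traffic."""
    return [p for p in packets if p.proto != proto]


def reassemble(packets, src, dst):
    """Join the data sent from src to dst, ordered by sequence number."""
    one_way = [
        p for p in packets
        if p.proto == "tcp" and p.src == src and p.dst == dst
    ]
    one_way.sort(key=lambda p: p.seq)
    return b"".join(p.payload for p in one_way)


# Made-up capture: a web server's reply arriving out of order, plus noise.
SERVER = "10.0.0.5:80"
CLIENT = "10.0.0.9:51000"
SAMPLE_CAPTURE = [
    Packet("tcp", SERVER, CLIENT, 1006, b"hello"),
    Packet("arp", "10.0.0.1:0", "10.0.0.9:0", 0, b""),
    Packet("tcp", SERVER, CLIENT, 1000, b"<html>"),
    Packet("tcp", CLIENT, SERVER, 500, b"GET / HTTP/1.0\r\n\r\n"),
    Packet("tcp", SERVER, CLIENT, 1011, b"</html>"),
]
```

Calling `reassemble(SAMPLE_CAPTURE, SERVER, CLIENT)` stitches the server's three data packets back into the page it sent, which is the same idea as reconstructing an HTML page from a stream.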
There are many reasons you might want to sniff your internet connection, or even to capture and record all packets that are passing through. One reason is that it is a fascinating and great way to learn about networks and how packets flow through the network. Another reason could be to find and defeat a hacking attack or malware. You could also monitor your network to determine what users are doing and watch them (like the NSA).
Continue reading ‘Network Layouts for IP Sniffing’
I’ve had a very busy last few weeks, working on several different projects at work and finishing up finals for my night classes. I’ve learned several interesting things in the past two weeks at my job.
The first thing that I learned was that Dell support people will bug you until you fix your computer. A hard drive went bad in one of our production servers, so I called Dell Gold support (which thankfully has American techs) to get a replacement. After a lot of discussion, the tech told me to run a firmware update which would fix the issue. So I had to explain to him that it was a production server, and that doing the fix he wanted would require me to schedule downtime and then go in to the hosted environment on a Saturday to perform it.
Continue reading ‘Lessons learned from IT this week’
I have been using the Google personalized homepage, www.google.com/ig, as my home page since the day it came out. I use it on several browsers on several different computers. This morning, when I opened up Firefox on my work computer, I got a Google CAPTCHA screen, which I have shown below: (Click to view full size)

I figured it must be some virus or spyware, but multiple scans revealed nothing, and I keep a close eye on my computer too. So I tried IE7 and got the same result. When I tried to access the homepage from my home computer via my VNC connection, it came up just fine. So I rebooted my work PC. Same thing. I then tried it from another work computer (a brand new Dell laptop right out of the box, so no spyware there) and got the same CAPTCHA screen! So my conclusion is that Google is blocking my IP address from the home page. Every time I reopen my browser I have to re-enter a CAPTCHA code. This is pretty annoying, since I open my browser a lot. If it doesn’t go away in the next day or two, I am definitely going to move my homepage to something else. Has this happened to anyone else?
Update:
I learned from some comments on Digg that this is related to a Da Vinci Code Quest that Google is running from the personalized homepage. I’m not sure why I was affected, but it seems that this has happened to a lot of people. The Quest is ending soon, so things should return to normal soon.
I came across a very useful tool for logging port use in Windows. It is called Port Reporter. This tool runs as a service on a Windows 2000, XP, or 2003 computer. It logs all TCP and UDP port use to log files. A separate utility called the Port Reporter Parser provides a nice GUI for viewing the log files and analyzing the data.
Continue reading ‘Port Reporter, a Windows tool for logging port use’
I like to monitor the stats for my website very closely, to see how visitors find the articles on my site. I have noticed that the majority of visitors come from Google or some other search engine. I am a big fan of Google and use it all the time. In WordPress, I use the Google sitemap plugin to generate a sitemap and automatically submit it to Google. But through keeping an eye on my Mint stats (Mint is the best stat program for a website ever), I have noticed that when I write a new article, there are no hits to that article from Google for at least two weeks. For example, exactly two weeks ago, I wrote an article about memory dumps. For the last two weeks, I did not receive a single hit on that article from any searches on Google. Tonight I suddenly received several hits on that article. I have noticed the same thing with other articles that I have written. It doesn’t bother me that it takes two weeks for my articles to show up in search results; I just found it interesting. So if you are running a website, be aware that in my experience it can take about two weeks for search results to start returning new pages from your site.