
Open Proxy List From Spams


 

Open Proxy List from SPAMs

Chiung-ting Chen

Abstract

SPAM mail remains a major security issue. This article reports an investigation based on the SPAM mails sent to the author’s personal mailbox. From the IP address of the host that connected directly to the mail server, a list of open proxies is constructed. These are real open proxies in the wild that can serve as hiding nodes for malicious Internet activities. An experimental mail server plugin is developed to ban mails sent through open proxies. The plugin not only bans these mails but also collects a list of open proxies, which can be shared with others to improve their anti-spam work.

 

Keywords: SPAM, open-proxy, security, postfix, content filter

Open Proxy List from SPAMs

SPAM mail has been a security issue since 1978 (Templeton, n.d.), and this problem has never been solved. Several techniques have been applied to reduce the amount of SPAM mail, such as greylisting, DNSBL, and heuristic content checking.

In the author’s mail server environment, heuristic content checking is applied. Mails that the heuristic content checking classifies as SPAM are still delivered to the user’s mailbox, but with a [SPAM] tag added to the subject.

Recently, the author’s personal mailbox still held a large number of tagged SPAM mails. This raised the author’s interest, and an investigation of these mails was carried out.

The first section briefly introduces related anti-spam techniques. The following section describes the investigation procedure and the observation it led to. The third section describes a prototype mail server plugin developed from that observation. The last part concludes the article.

Related anti-spam techniques

There are several methods to block SPAM mail. The following are the most commonly seen mechanisms.

Grey Listing

A mail server that follows the SMTP RFC retries delivery several times if a delivery attempt fails. The main idea of grey listing is to temporarily reject the first delivery attempt from an unrecognized mail server. If the same server retries afterwards, the mail is accepted and the server is marked as a good one. This works very well because most SPAM mails are sent by programs that do not follow the retry rule, and they come from arbitrary hosts rather than from servers with fixed IPs. If such a SPAM mail cannot be delivered on the first attempt, the sender never retries. The drawback is that mail from a server seen for the first time is not delivered instantly. Users do not appreciate this because they expect e-mail to arrive immediately.
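The retry rule above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class name, the 300-second delay, and keying on the connecting IP alone are choices made for this sketch and are not taken from any particular greylisting implementation.

```python
import time


class Greylist:
    """Toy greylist keyed on the connecting IP only (an assumption of this sketch)."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay   # seconds an unknown host must wait before a retry counts
        self.first_seen = {}         # ip -> time of its first delivery attempt
        self.known_good = set()      # hosts that retried as a compliant server would

    def check(self, ip, now=None):
        """Return 'accept' or 'tempfail' for a delivery attempt from ip."""
        now = time.time() if now is None else now
        if ip in self.known_good:
            return "accept"
        first = self.first_seen.setdefault(ip, now)
        if now - first >= self.min_delay:
            self.known_good.add(ip)
            return "accept"
        # Temporary rejection: a compliant server queues the mail and retries later.
        return "tempfail"
```

A host that never retries stays in first_seen forever and its mail is never accepted, which is exactly the behavior that filters out most spamming programs.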

DNSBL

DNSBL stands for DNS-based Blacklist. Several companies and organizations host servers that list bad IP addresses, usually those of hosts that connect to mail servers. Mail servers perform the lookup over the DNS protocol. When a mail server gets a connection from host A.B.C.D, it asks a DNS server for the record of D.C.B.A.dnsbl.example.net, where dnsbl.example.net is the provider of the list. If it gets an “A” record, the host is on the blacklist and should be blocked.

The entries in the list may be reported by users or collected by searching for open relay mail servers. The downside of this method is that home users might get blocked because their IP addresses are dynamic, so an address may have been listed because of a previous holder.
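As a minimal sketch of the lookup described above, the query name can be built by reversing the octets and appending the list's zone. The zone dnsbl.example.net is the placeholder used in the text, and the function names are hypothetical. Note that this sketch treats any lookup failure as "not listed".

```python
import socket


def dnsbl_query_name(ip, zone="dnsbl.example.net"):
    """Reverse the IPv4 octets and append the zone of the list provider."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address: %r" % ip)
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone="dnsbl.example.net"):
    """Return True if the zone answers with an A record (performs a DNS query)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name: the address is not on this list (or the lookup failed).
        return False
```

For example, dnsbl_query_name("67.247.233.225") gives "225.233.247.67.dnsbl.example.net".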

Heuristic Content Checking

This method examines the content of the mail against a large set of rules, each with a weighted score. The scores of the matching rules are accumulated, and if the total reaches a threshold set by the mail server administrator, the mail is regarded as SPAM. Some of the rules check whether the mail follows the RFC. However, many automatically generated mails do not, so mails such as library notifications or electronic bills are often tagged as SPAM as well. The other drawback is that these checks are usually slow because they require a lot of regular expression matching.
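The scoring scheme can be illustrated with a small sketch. The three rules, their scores, and the threshold below are hypothetical and much simpler than the rule set of a real checker such as SpamAssassin. They only show how matching rules accumulate toward a threshold.

```python
import re

# Hypothetical rules: (name, pattern, score). Not a real checker's rule set.
RULES = [
    ("SUBJ_ALL_CAPS", re.compile(r"^Subject: [^a-z\n]*[A-Z][^a-z\n]*$", re.M), 1.5),
    ("HTML_MESSAGE", re.compile(r"<html|<p[ >]", re.I), 0.5),
    ("MSGID_SHORT", re.compile(r"^Message-ID: <.{1,12}>$", re.M | re.I), 1.0),
]


def score_message(raw, threshold=2.0):
    """Return (total score, is_spam, names of the rules that matched)."""
    hits = [(name, score) for name, pattern, score in RULES if pattern.search(raw)]
    total = sum(score for _, score in hits)
    return total, total >= threshold, [name for name, _ in hits]
```

Every rule is a regular expression run over the whole message, which also shows why this kind of checking becomes slow as the rule set grows.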

 

Investigation

To start the investigation, we focus on the mails tagged as SPAM by the heuristic content checking. After collecting the mails, we look at the mail headers first. Mail headers are the metadata of a mail. In normal cases, they include the sender’s and receiver’s e-mail addresses, the subject, the date, and other attributes of the mail. They also reveal important information such as the delivery route of the mail, the MUA (mail user agent), and data appended by mail servers.

The investigation first tried visualizing the data in the headers, which did not work well. The direction then changed to digging into individual mails. SPAM mails were picked at random and their headers inspected, which led to a major finding: most of the hosts that connected to the local mail server were open proxies.
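As a minimal sketch of reading these headers programmatically, Python's standard email package can parse a raw message and return every Received header in top-to-bottom order. The message below is a shortened, hypothetical sample with documentation-only addresses, not one of the collected mails.

```python
import email

# Hypothetical, shortened message used only for illustration.
RAW = (
    "Received: from mx.example.org (mx.example.org [192.0.2.10])\n"
    "\tby mail.example.net (Postfix) with ESMTP id ABC123\n"
    "\tfor <user@example.net>; Fri, 15 Oct 2010 07:13:48 +0800 (CST)\n"
    "Received: from sender.example.com ([198.51.100.7])\n"
    "\tby mx.example.org with SMTP id XYZ789;\n"
    "\tFri, 15 Oct 2010 07:12:59 +0800 (CST)\n"
    "Subject: hello\n"
    "X-Mailer: ExampleMailer 1.0\n"
    "\n"
    "body text\n"
)

msg = email.message_from_string(RAW)

# get_all() returns one string per occurrence; the most recent hop comes first.
received = msg.get_all("Received")
# Single-valued headers can be read like dictionary entries.
mailer = msg["X-Mailer"]
```

Each mail server prepends its own Received header, so reading the list from the last entry to the first follows the mail from its claimed origin to the final mailbox.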

Visualization

To ease the effort of going through all these headers, visualizing the data is a good way to start looking for clues (Marty, 2008). An assumption is made here: all the hosts on the delivery routes of SPAM mails are somehow related to each other. With this assumption, we can retrieve the IP of every host that delivered a mail and draw a link graph connecting them. The graph should reveal more clues. The points of interest would be nodes that are highly connected or nodes that are connected in a unique way.

The routing information comes from the “Received” headers. Below is an example, a portion of a mail header reduced to its Received headers. The author’s local mail server IP and e-mail address are replaced with fake ones in all examples in this article.
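The link graph idea can be sketched as follows, assuming each mail has already been reduced to an ordered list of hop addresses. The function names and the sample routes in the usage note are hypothetical. The article does not state which tool drew its figures, so this only shows the underlying data structure.

```python
from collections import defaultdict


def build_link_graph(routes):
    """routes: list of hop lists, each ordered from origin to destination.
    Returns an undirected adjacency map: node -> set of neighbouring nodes."""
    graph = defaultdict(set)
    for hops in routes:
        for src, dst in zip(hops, hops[1:]):
            graph[src].add(dst)
            graph[dst].add(src)
    return graph


def by_degree(graph):
    """Nodes sorted by number of neighbours, most connected first."""
    return sorted(graph, key=lambda node: (-len(graph[node]), node))
```

With the two hypothetical routes [["192.0.2.1", "198.51.100.9", "203.0.113.5"], ["192.0.2.7", "198.51.100.9", "203.0.113.5"]], the node 198.51.100.9 ends up with three neighbours and is listed first by by_degree, which is the kind of highly connected node the assumption expects to find.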

Received: from aboj.cc (localhost [127.0.0.1])

by aboj.cc (Postfix) with ESMTP id AF5E46D44A

for <chen@aboj.cc>; Fri, 15 Oct 2010 07:13:48 +0800 (CST)

Received: from lab.anonymous.edu.tw (lab.anonymous.edu.tw [192.168.155.134])

by aboj.cc (Postfix) with ESMTP id 909376D447

for <chen@aboj.cc>; Fri, 15 Oct 2010 07:13:48 +0800 (CST)

Received: by lab.anonymous.edu.tw (Postfix)

id 2431D50F3DE; Fri, 15 Oct 2010 07:13:05 +0800 (CST)

Delivered-To: chen@lab.anonymous.edu.tw

Received: from cpe-67-247-233-225.buffalo.res.rr.com (cpe-67-247-233-225.buffalo.res.rr.com [67.247.233.225])

by lab.anonymous.edu.tw (Postfix) with SMTP id B689650F3DE

for <chen@lab.anonymous.edu.tw>; Fri, 15 Oct 2010 07:12:59 +0800 (CST)

Received: from dns625.gmail.com (eimibkbzm.gmail.com [210.63.128.223]) by 67.247.233.225 with Microsoft SMTPSVC(5.0.2195.6824);

Mon, 25 Oct 2010 21:06:47 -0200

Received: from dns7.gmail.com ([210.67.45.171])

by 211.76.180.92 with SMTP id mwo676XW02GVxg6

for <jkvapa@gmail.com>; Mon, 25 Oct 2010 17:12:47 -0600

 

From this example, the mail is delivered via the following hosts in order:

  1. 210.67.45.171
  2. 211.76.180.92 / 210.63.128.223
  3. 67.247.233.225
  4. lab.anonymous.edu.tw (local mail server)
  5. aboj.cc (local mail server)

As you can see, the second hop is inconsistent in the mail header. According to the last Received record, the mail is received by 211.76.180.92. However, according to the next record, it is received from 210.63.128.223. This indicates that the last record is forged by the spammer. Nevertheless, in order to proceed, a program is developed to retrieve the IP with the following pattern:

Received: from host.domain.example.com ([a.b.c.d])

The example above would then have the following IPs after excluding local servers:

  1. 210.67.45.171
  2. 210.63.128.223
  3. 67.247.233.225
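The extraction step can be sketched as below. This is not the author's original program, only a minimal Python illustration of the same pattern. SAMPLE holds the first line of each Received header from the example above, and the set of local addresses is taken from that example.

```python
import re

# Addresses of the local servers in the example (masked values from the article).
LOCAL_IPS = {"127.0.0.1", "192.168.155.134"}

# "Received: from host (... [a.b.c.d])": capture the parenthesised part first,
# then the bracketed IPv4 address inside it.
RECEIVED_RE = re.compile(r"^Received: from \S+ \(([^)]*)\)", re.M)
IP_RE = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def delivery_ips(raw_headers, local_ips=LOCAL_IPS):
    """Return the non-local IPs of the Received headers, origin first."""
    ips = []
    for match in RECEIVED_RE.finditer(raw_headers):
        found = IP_RE.search(match.group(1))
        if found and found.group(1) not in local_ips:
            ips.append(found.group(1))
    ips.reverse()  # headers are stacked with the most recent hop on top
    return ips


SAMPLE = "\n".join([
    "Received: from aboj.cc (localhost [127.0.0.1])",
    "Received: from lab.anonymous.edu.tw (lab.anonymous.edu.tw [192.168.155.134])",
    "Received: by lab.anonymous.edu.tw (Postfix)",
    "Received: from cpe-67-247-233-225.buffalo.res.rr.com "
    "(cpe-67-247-233-225.buffalo.res.rr.com [67.247.233.225])",
    "Received: from dns625.gmail.com (eimibkbzm.gmail.com [210.63.128.223]) "
    "by 67.247.233.225 with Microsoft SMTPSVC(5.0.2195.6824);",
    "Received: from dns7.gmail.com ([210.67.45.171])",
])

hops = delivery_ips(SAMPLE)
```

Because the pattern only reads the "from" side of each record, the forged "by 211.76.180.92" value never enters the result.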

 

Figure 1. The link graph of the hosts of delivery

 

The result is shown in Figure 1. It proves the assumption wrong: these hosts are barely related to each other. Under the assumption, we expected the whole graph to look like the one in Figure 2, but with more complexity and at a larger scale.

 

Figure 2. A subset of Figure 1. Hosts are linked.

SPAM via free discussion groups

After the visualization approach failed, we selected SPAM mails at random and inspected their headers. The following is one type of SPAM, which is spread through open discussion platforms.

 

Return-Path: <sentto-67566559-124-1288758846-chen=lab.anonymous.edu.tw@returns.groups.yahoo.com>

X-Original-To: chen@aboj.cc

Delivered-To: chen@aboj.cc

Received: from 127.0.0.1 (localhost [127.0.0.1])

by aboj.cc (Postfix) with SMTP id A81376D447

for <chen@aboj.cc>; Wed,  3 Nov 2010 12:34:21 +0800 (CST)

X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on aboj.cc

X-Spam-Level: **

X-Spam-Status: No, score=2.8 required=8.5 tests=BAYES_40,DKIM_ADSP_CUSTOM_MED,

DKIM_SIGNED,DKIM_VALID,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_BRBL_LASTEXT,

TRACKER_ID,T_TO_NO_BRKTS_FREEMAIL autolearn=no version=3.3.1

X-Spam-Flag: NO

Received: from aboj.cc (localhost [127.0.0.1])

by aboj.cc (Postfix) with ESMTP id DFBB76D46E

for <chen@aboj.cc>; Wed,  3 Nov 2010 12:34:15 +0800 (CST)

Received: from lab.anonymous.edu.tw (lab.anonymous.edu.tw [192.168.155.134])

by aboj.cc (Postfix) with ESMTP id C1FD56D468

for <chen@aboj.cc>; Wed,  3 Nov 2010 12:34:15 +0800 (CST)

Received: by lab.anonymous.edu.tw (Postfix)

id 4A2EF50F3DE; Wed,  3 Nov 2010 12:33:21 +0800 (CST)

Delivered-To: chen@lab.anonymous.edu.tw

Received: from n60c.bullet.mail.sp1.yahoo.com (n60c.bullet.mail.sp1.yahoo.com [98.136.45.59])

by lab.anonymous.edu.tw (Postfix) with SMTP id D9F4450F3DE

for <chen@lab.anonymous.edu.tw>; Wed,  3 Nov 2010 12:33:14 +0800 (CST)

DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoogroups.com; s=lima; t=1288758848; bh=tst+SuMOY+NdqdhC+kHhs0B/uheF4KLQ8TL4TJBvv4o=; h=Received:Received:X-Yahoo-Newman-Id:X-Sender:X-Apparently-To:X-Received:X-Received:X-Received:X-Received:X-Received:To:Message-ID:User-Agent:X-Mailer:X-Originating-IP:X-Yahoo-Post-IP:From:X-Yahoo-Profile:Sender:MIME-Version:Mailing-List:Delivered-To:List-Id:Precedence:List-Unsubscribe:Date:Subject:Reply-To:X-Yahoo-Newman-Property:Content-Type; b=fEkytnSL9d3bgM21RaxQtDKwi5YMjrSdvlFfcpG/i0bZh6/i6Mj1P8NhudPxKNpG4XTTSuof2PjfPGtOQUDN0H0hRDBBjgpHCKwCk7f0+m0pV3w/aWjPlbIgkMr9tmLr

DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=lima; d=yahoogroups.com;

b=Qtayjjvum0ehrG+BaBk6fSCeHCWjPueSewifsbJE+XL8uhtUceXN8j6/N/3HihboZLOV0pGJDxEx/SFl3bWo04/NcxD8aLtyLuPRqrJWHxo41FGHrZnAR9aWwdiEdkvr;

Received: from [69.147.65.149] by n60.bullet.mail.sp1.yahoo.com with NNFMP; 03 Nov 2010 04:34:08 -0000

Received: from [66.196.95.33] by t9.bullet.mail.sp1.yahoo.com with NNFMP; 03 Nov 2010 04:34:08 -0000

X-Yahoo-Newman-Id: 67566559-m124

X-Sender: mathewrogerdavidson@yahoo.com.tw

X-Apparently-To: rjfou@yahoogroups.com

X-Received: (qmail 1347 invoked from network); 3 Nov 2010 04:34:00 -0000

X-Received: from unknown (66.196.94.107)

by m16.grp.re1.yahoo.com with QMQP; 3 Nov 2010 04:34:00 -0000

X-Received: from unknown (HELO n41b.bullet.mail.sp1.yahoo.com) (66.163.168.155)

by mta3.grp.re1.yahoo.com with SMTP; 3 Nov 2010 04:34:00 -0000

X-Received: from [69.147.65.173] by n41.bullet.mail.sp1.yahoo.com with NNFMP; 03 Nov 2010 04:33:59 -0000

X-Received: from [98.137.34.73] by t15.bullet.mail.sp1.yahoo.com with NNFMP; 03 Nov 2010 04:33:59 -0000

To: rjfou@yahoogroups.com

Message-ID: <iaqonm+3gks@eGroups.com>

User-Agent: eGroups-EW/0.82

X-Mailer: Yahoo Groups Message Poster

X-Originating-IP: 66.163.168.155

X-Yahoo-Post-IP: 118.166.211.198

From: "mathewrogerdavidson" <mathewrogerdavidson@yahoo.com.tw>

X-Yahoo-Profile: mathewrogerdavidson

Sender: rjfou@yahoogroups.com

MIME-Version: 1.0

Mailing-List: list rjfou@yahoogroups.com; contact rjfou-owner@yahoogroups.com

Delivered-To: mailing list rjfou@yahoogroups.com

List-Id: <rjfou.yahoogroups.com>

Precedence: bulk

List-Unsubscribe: <mailto:rjfou-unsubscribe@yahoogroups.com>

Date: Wed, 03 Nov 2010 04:33:58 -0000

Subject: [rjfou] =?big5?B?vejAdaFBu/m3R6FBqqus/KFBqkGwyKZuoUHF/bF6?=

=?big5?B?tlKquqnxpN+hQaXOqrq2faTfIXBqampjM3g=?=

Reply-To: rjfou@yahoogroups.com

X-Yahoo-Newman-Property: groups-email-ff-u

Content-Type: multipart/alternative;

boundary="1-0389514698-4772064581=:3"

MailScanner-NULL-Check: 1289363597.16652@3vCCEr+iRxTgoYKSuQQnOg

X-WMLAB-MailScanner-Information: Please contact postmaster@lab.anonymous.edu.tw for more information

X-MailScanner-ID: D9F4450F3DE.12AD0

X-WMLAB-MailScanner: Found to be clean

X-WMLAB-MailScanner-SpamCheck: not spam, SpamAssassin (not cached,

score=8.692, required 9, BAYES_99 3.50, FH_DATE_PAST_20XX 3.19,

HTML_MESSAGE 0.00, TRACKER_ID 2.00)

X-WMLAB-MailScanner-SpamScore: ssssssss

X-WMLAB-MailScanner-From: sentto-67566559-124-1288758846-chen=lab.anonymous.edu.tw@returns.groups.yahoo.com

X-Virus-Scanned: ClamAV using ClamSMTP

 

--1-0389514698-4772064581=:3

 

The SPAM itself is posted inside the discussion group. For the instance above, SPAM mails can be found by visiting the Yahoo group “rjfou.” In the SPAMs inside this group, the ad link points to a short URL service in Japan. It is hard to investigate further because any user can easily access such a service, and the usage logs are kept by the service provider.

Moreover, we cannot act on the delivery hosts, because the mail is generated by another company’s discussion group service, and banning those hosts might cause mails from legitimate discussion groups to be dropped.

 

SPAM from open proxies

After characterizing the SPAM mails from free discussion groups, we focused on another type of SPAM mail. Mails of the previous type pass the heuristic content checking because they are sent from normal mailers and follow the RFC. Mails of the other type are already tagged as SPAM. The following is an example:

Return-Path: <arvjqnatho@com.tw>

X-Original-To: chen@aboj.cc

Delivered-To: chen@aboj.cc

Received: from 127.0.0.1 (localhost [127.0.0.1])

by aboj.cc (Postfix) with SMTP id 7014B6D458

for <chen@aboj.cc>; Wed,  3 Nov 2010 13:15:33 +0800 (CST)

X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on aboj.cc

X-Spam-Level: ****************************

X-Spam-Status: Yes, score=28.1 required=8.5 tests=BAYES_99,DATE_IN_FUTURE_96_Q,

FORGED_MUA_THEBAT_BOUN,HK_RANDOM_ENVFROM,HK_RANDOM_FROM,

HTML_FONT_LOW_CONTRAST,HTML_FONT_SIZE_HUGE,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,

INVALID_MSGID,MIME_BOUND_DD_DIGITS,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,

MIME_QP_LONG_LINE,MISSING_MIMEOLE,MPART_ALT_DIFF,MSGID_SHORT,

NO_RDNS_DOTCOM_HELO,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,

RAZOR2_CHECK,RCVD_IN_BRBL_LASTEXT,SUBJECT_NEEDS_ENCODING,SUBJ_ALL_CAPS,

SUBJ_ILLEGAL_CHARS,URIBL_BLACK autolearn=spam version=3.3.1

X-Spam-Report:

*  1.7 URIBL_BLACK Contains an URL listed in the URIBL blacklist

*      [URIs: cb0yb3.com]

*  3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%

*      [score: 1.0000]

*  1.4 MIME_BOUND_DD_DIGITS Spam tool pattern in MIME boundary

*  0.0 MSGID_SHORT Message-ID is unusually short

*  2.4 HK_RANDOM_FROM From username looks random

*  0.0 HK_RANDOM_ENVFROM Envelope sender username looks random

*  3.2 DATE_IN_FUTURE_96_Q Date: is 4 days to 4 months after Received: date

*  0.8 NO_RDNS_DOTCOM_HELO Host HELO'd as a big ISP, but had no rDNS

*  1.5 SUBJ_ALL_CAPS Subject is all capitals

*  0.0 HTML_MESSAGE BODY: HTML included in message

*  0.0 HTML_FONT_SIZE_HUGE BODY: HTML font size is huge

*  0.0 HTML_FONT_LOW_CONTRAST BODY: HTML font color similar to background

*  0.8 MPART_ALT_DIFF BODY: HTML and text parts are different

*  0.7 MIME_HTML_ONLY BODY: Message only has text/html MIME parts

*  0.0 MIME_QP_LONG_LINE RAW: Quoted-printable line longer than 76 chars

*  1.9 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level

*      above 50%

*      [cf: 100]

*  0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%

*      [cf: 100]

*  0.9 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)

*  1.4 RCVD_IN_BRBL_LASTEXT RBL: RCVD_IN_BRBL_LASTEXT

*      [192.168.155.134 listed in bb.barracudacentral.org]

*  0.4 HTML_MIME_NO_HTML_TAG HTML-only message, but there is no HTML tag

*  1.0 SUBJ_ILLEGAL_CHARS Subject: has too many raw illegal characters

*  0.0 SUBJECT_NEEDS_ENCODING SUBJECT_NEEDS_ENCODING

*  3.4 FORGED_MUA_THEBAT_BOUN Mail pretending to be from The Bat!

*      (boundary)

*  0.0 MIME_HTML_ONLY_MULTI Multipart message only has text/html MIME parts

*  1.9 MISSING_MIMEOLE Message has X-MSMail-Priority, but no X-MimeOLE

*  0.6 INVALID_MSGID Message-Id is not valid, according to RFC 2822

X-Spam-Flag: YES

Received: from aboj.cc (localhost [127.0.0.1])

by aboj.cc (Postfix) with ESMTP id BD3056D447

for <chen@aboj.cc>; Wed,  3 Nov 2010 13:15:28 +0800 (CST)

Received: from lab.anonymous.edu.tw (lab.anonymous.edu.tw [192.168.155.134])

by aboj.cc (Postfix) with ESMTP id 9B7276D42D

for <chen@aboj.cc>; Wed,  3 Nov 2010 13:15:28 +0800 (CST)

Received: by lab.anonymous.edu.tw (Postfix)

id 3CCF950F3DE; Wed,  3 Nov 2010 13:14:34 +0800 (CST)

Delivered-To: chen@lab.anonymous.edu.tw

Received: from YOUNGGIE (unknown [211.119.105.142])

by lab.anonymous.edu.tw (Postfix) with SMTP id 2F3AD50F3DE

for <chen@lab.anonymous.edu.tw>; Wed,  3 Nov 2010 13:14:27 +0800 (CST)

Received: from dns088.aol.com ([61.56.116.180])

by 203.71.212.142 with SMTP id n09VU1GSTty3504

for <qhazdconi@aol.com>; Mon, 15 Nov 2010 03:12:51 -0200

Message-ID: <tozxr3gtat1>

From: "顏振正" <arvjqnatho@com.tw>

Reply-To: "顏振正" <arvjqnatho@com.tw>

To: chen@lab.anonymous.edu.tw

Subject: [SPAM] CHANEL沒有想像中那.麼難擁有 GOGO

Date: Mon, 15 Nov 2010 08:12:51 +0300

X-Mailer: The Bat! (v1.52f) Business

MIME-Version: 1.0

Content-Type: multipart/alternative;

boundary="--70279461632447869"

X-Priority: 3

X-MSMail-Priority: Normal

MailScanner-NULL-Check: 1289366069.95926@gLwq4DitY/8TdOo1KwhBGg

X-WMLAB-MailScanner-Information: Please contact postmaster@lab.anonymous.edu.tw for more information

X-MailScanner-ID: 2F3AD50F3DE.494D6

X-WMLAB-MailScanner: Found to be clean

X-WMLAB-MailScanner-SpamCheck: spam, SpamAssassin (not cached, score=30.675,

required 9, autolearn=spam, BAYES_99 3.50, DATE_IN_FUTURE_96_XX 1.44,

FH_DATE_PAST_20XX 3.19, FORGED_MUA_THEBAT_BOUN 1.25,

FORGED_THEBAT_HTML 3.20, HTML_FONT_LOW_CONTRAST 0.12,

HTML_FONT_SIZE_HUGE 0.06, HTML_MESSAGE 0.00,

HTML_MIME_NO_HTML_TAG 0.10, INVALID_MSGID 1.90,

MIME_BOUND_DD_DIGITS 1.47, MIME_HTML_ONLY 1.46,

MIME_HTML_ONLY_MULTI 0.00, MIME_QP_LONG_LINE 1.40,

MISSING_MIMEOLE 0.00, MPART_ALT_DIFF 0.74, MSGID_SHORT 1.08,

NO_RDNS_DOTCOM_HELO 0.00, RCVD_IN_XBL 3.03, RDNS_NONE 0.10,

SUBJECT_NEEDS_ENCODING 0.00, SUBJ_ALL_CAPS 2.08,

SUBJ_ILLEGAL_CHARS 1.00, TVD_RCVD_SINGLE 1.35, TVD_SPACE_RATIO 2.22)

X-WMLAB-MailScanner-SpamScore: ssssssssssssssssssssssssssssss

X-WMLAB-MailScanner-From: arvjqnatho@com.tw

X-Virus-Scanned: ClamAV using ClamSMTP

 

----70279461632447869

Content-Type: text/html; charset="big5"

Content-Transfer-Encoding: quoted-printable

 

<head>

</head>

<p>=A1@</p>

<p style=3D"margin-top: 5px; margin-bottom: 0"><b><a href=3D"http://ksszz.=

cb0yb3.com"><font size=3D"6" color=3D"#F0AC0F">=A6A=A4=A3=B6R OMEGA=BF=F6=B5=

=B9=A6=D1=A4=BD=B4N=ADn'=C2=F7=B1B=A4F</font></a></b></p>

 

<p>=A1@</p>

<p>=A1@</p>

<p style=3D"margin-top: 5px; margin-bottom: 0"><b><a href=3D"http://keq.cb=

0yb3.com"><font size=3D"8" color=3D"#ffff66">=A6A=A4=A3=B6RLV=B5=B9=A6=D1=B1=

C.=AA=D6=A9w=B3Q=C2d=BA=E2=BDL</font></a></b></p>

<p>=A1@</p>

<p>=A1@</p>

<p style=3D"margin-top: 5px; margin-bottom: 0"><b><a href=3D"http://zlkn.c=

b0yb3.com"><font size=3D"7" color=3D"#ff3366">=ABz=BEaAP=B7R=A9=BC=BF=F6=A9=

~'=B5M=A5u=ADn2300 ???</font></a></b></p>

 

</body>

 

 

----70279461632447869--

 

 

 

There are several interesting things about this type. First of all, some of the headers are not real. This is the same phenomenon as in the header example in the Visualization part of this article. The first hop in the “Received:” headers indicates that the mail was sent from 61.56.116.180 to 203.71.212.142. However, the second hop says that it was sent from 211.119.105.142 to lab.anonymous.edu.tw. These records do not match, because the recipient of the first hop should be the sender of the second hop. Therefore, the first record was forged by the program that sent this e-mail.

The second interesting thing is that the X-Mailer header “The Bat! (v1.52f) Business” is a signature of a known spamming Trojan (F-Secure, 2005). The Trojan was found in 2005 and is still in the wild. However, this sender is probably a variant, since some other signatures described for this Trojan do not fit.

To take a step further, it is interesting to dig into the IP that delivered the mail to the server, since it may be part of a botnet. A very simple port scan of this IP finds a SOCKS proxy port.
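The consistency check described in this paragraph can be sketched as below. It assumes the Received headers have already been reduced to (from IP, by host) pairs, oldest hop first. The function name is hypothetical, and the comparison is only made when the receiving side is written as an IP literal, since a host name cannot be compared without a DNS lookup.

```python
import re

IP_RE = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}$")


def find_mismatched_hops(hops):
    """hops: list of (from_ip, by_host) tuples, oldest hop first.
    Returns the indexes i where hop i says it was received by one IP
    but hop i + 1 says the mail came from a different IP."""
    suspicious = []
    for i in range(len(hops) - 1):
        by_host = hops[i][1]
        next_from = hops[i + 1][0]
        if IP_RE.match(by_host) and by_host != next_from:
            suspicious.append(i)
    return suspicious


# The two hops of the example above, oldest first.
example = [
    ("61.56.116.180", "203.71.212.142"),
    ("211.119.105.142", "lab.anonymous.edu.tw"),
]
forged = find_mismatched_hops(example)
```

For the example, forged contains index 0: the oldest record claims the mail was received by 203.71.212.142, while the record written by the local server shows that it actually arrived from 211.119.105.142.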

Starting Nmap 5.21 ( http://nmap.org ) at 2010-11-03 23:40 CST

Nmap scan report for 211.119.105.142

Host is up (0.072s latency).

Not shown: 990 closed ports

PORT     STATE    SERVICE

21/tcp   open     ftp

25/tcp   open     smtp

110/tcp  open     pop3

119/tcp  open     nntp

135/tcp  filtered msrpc

445/tcp  filtered microsoft-ds

1080/tcp open     socks

3389/tcp open     ms-term-serv

4444/tcp filtered krb524

6004/tcp open     X11:4

 

Nmap done: 1 IP address (1 host up) scanned in 2.20 seconds

 

Several SPAM mails were investigated in the same way, and open proxies were found in most cases, whether an HTTP proxy on port 8080 or 3128 or another type of proxy. This indicates that these hosts are intermediate sites.

The SOCKS proxy in the above example really works. With 211.119.105.142 configured as the SOCKS server, an e-mail was sent to the mail server, and the mail log confirmed that the connection was initiated from this address.

To get more information about these open proxies, the program developed for the visualization was revised. For each IP that connected directly to the mail server, a test is performed to check whether common proxy ports are open.

#!/usr/bin/env python3
# For every SPAM mail in a directory, find the host that handed the mail to
# the local mail server and test whether a common proxy port is open on it.
import os
import re
import socket
import sys

# Directory holding one raw mail per file (default: ./pool).
dirname = sys.argv[1] if len(sys.argv) == 2 else 'pool'

LOCAL_SERVER_IP = '192.168.155.134'   # the (masked) local mail server
PROXY_PORTS = [3128, 8080, 1080]

RECEIVED_RE = re.compile(r'Received: from ([^ ]+) \(([^)]+)\)')
IP_RE = re.compile(r'\[([^ ]+)\]')


def testproxy(ip):
    """Print the IP and, if any, the last tested port that accepted a connection."""
    open_port = None
    for p in PROXY_PORTS:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.settimeout(2)
            s.connect((ip, p))
            open_port = p
        except socket.error:
            pass
        finally:
            s.close()

    if open_port is not None:
        print("%s  open on %d" % (ip, open_port))
    else:
        print(ip)


def rec(path):
    """Collect the bracketed IPs of the Received headers from top to bottom and
    test the one right after the local server, i.e. the connecting host."""
    stack = []
    with open(path, errors='replace') as f:
        for line in f:
            result = RECEIVED_RE.search(line.strip())
            if result is not None:
                ipresult = IP_RE.search(result.group(2))
                if ipresult is not None:
                    stack.append(ipresult.group(1))

    found_local = False
    for ip in stack:
        if found_local:
            testproxy(ip)
            break
        if ip == LOCAL_SERVER_IP:
            found_local = True


try:
    for name in os.listdir(dirname):
        rec(os.path.join(dirname, name))
except OSError as e:
    print(e.strerror)

 

This program was run against 49 SPAM mails received from November 1st, 2010 00:47:36 (GMT+8) to November 3rd, 2010 01:36:10 (GMT+8). Only 3 of the immediate IPs connected to the mail server are duplicates. Surprisingly, 32 of the 46 unique IPs have a possible proxy port (1080, 3128, or 8080) open, which is a 69.57% hit rate. Moreover, 8 of the IPs did not respond to ICMP packets and were presumably down at the time. Therefore, among the 38 hosts that were still alive, retrieving the IP from SPAM mails gives an 84.21% rate of finding an open proxy.

The result is listed below. The data includes all 49 SPAM mails, so duplicated hosts (grayed out in the original listing) appear more than once.
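The two rates follow directly from the counts in this paragraph. The short calculation below uses only those counts, and the helper name hit_rate is just for this sketch. Rounded to two decimals, 32 of 46 is 69.57% and 32 of 38 is 84.21%.

```python
def hit_rate(hits, total):
    """Percentage of hits among total."""
    return 100.0 * hits / total


mails = 49
unique_ips = mails - 3      # 3 of the 49 connecting IPs are repeats
with_open_port = 32         # unique IPs with 1080, 3128 or 8080 open
no_icmp_reply = 8           # unique IPs that did not answer ICMP

rate_all = hit_rate(with_open_port, unique_ips)                    # 32 / 46
rate_alive = hit_rate(with_open_port, unique_ips - no_icmp_reply)  # 32 / 38

print("all unique hosts: %.2f%%" % rate_all)
print("responding hosts: %.2f%%" % rate_alive)
```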

115.115.45.12

115.252.131.206

118.113.42.104

118.97.27.82  open on 8080

119.177.15.238  open on 8080

174.123.38.122

186.215.106.50  open on 3128

186.235.129.114  open on 3128

187.38.2.160  open on 8080

187.58.151.234  open on 8080

190.145.8.235  open on 8080

190.240.25.19  open on 8080

190.244.153.227  open on 8080

190.85.7.75  open on 8080

192.121.145.90

192.121.145.90

196.35.113.34  open on 3128

200.244.28.94  open on 3128

201.20.64.52  open on 3128

201.22.57.116  open on 3128

201.47.100.215  open on 3128

203.110.203.71  open on 3128

203.124.57.230  open on 1080

209.45.74.195

211.147.3.75  open on 3128

211.147.3.75  open on 3128

217.76.204.196  open on 3128

218.25.99.135  open on 1080

218.25.99.135  open on 1080

218.58.227.196

219.239.98.210

220.225.75.205

221.145.70.168

58.17.150.200  open on 1080

59.176.128.206  open on 3128

60.191.49.123  open on 3128

64.201.136.242

64.87.57.243  open on 3128

67.216.82.114

67.247.233.225  open on 8080

78.39.53.108  open on 3128

79.125.15.166

80.75.8.115  open on 1080

82.165.138.100  open on 3128

85.174.231.86  open on 1080

93.184.44.80  open on 3128

94.138.36.10  open on 8080

95.172.50.179

95.31.208.227  open on 3128

 

Proxytest Content Filter Prototype

Based on this observation, the author developed a prototype program to improve the banning of SPAM. The prototype is called “Proxytest”, and it is a Postfix content filter derived from the program shown above. The author’s local mail server already runs two content filters in order: clamsmtp for ClamAV scanning, then spamfilter for SpamAssassin. Proxytest is therefore plugged in after spamfilter.

The source code of Proxytest can be found at http://code.google.com/p/proxytest/.

 

How it works

When a mail arrives at the mail server, it is first checked by ClamAV. If it passes, it is then checked by the heuristic content checker, SpamAssassin. After that, the mail is passed to “Proxytest.” Figure 3 shows the flow of Proxytest.

 

Figure 3. The flow of Proxytest

 

This program only acts on mails that are tagged as SPAM. All mails that pass SpamAssassin’s check also pass Proxytest. To avoid performing the proxy port test every time, a blacklisting mechanism is built. It is simply a text file with one IP per line. A whitelisting mechanism is also developed but is not currently used. The proxy port test is a very simple procedure that tests whether port 1080, 3128, or 8080 is open on the connecting host. It does no further examination, so the IP addresses collected are only assumed to be open proxies.
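The decision flow described above can be sketched as follows. This is not the Proxytest source code: the function names are hypothetical, the blacklist is an in-memory set standing in for the text file, and checking the whitelist before the blacklist is an assumption of this sketch.

```python
import socket

PROXY_PORTS = (1080, 3128, 8080)


def has_open_proxy_port(ip, ports=PROXY_PORTS, timeout=2.0):
    """Return True if any of the common proxy ports accepts a TCP connection."""
    for port in ports:
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return True
        except OSError:
            continue
    return False


def proxytest_decision(ip, tagged_spam, blacklist, whitelist=frozenset(),
                       port_test=has_open_proxy_port):
    """Return 'pass' or 'block' for a mail handed over by the host ip."""
    if not tagged_spam or ip in whitelist:
        return "pass"            # only mails tagged as SPAM are examined
    if ip in blacklist:
        return "block"           # cached result, no port test needed
    if port_test(ip):
        blacklist.add(ip)        # remember the assumed open proxy
        return "block"
    return "pass"
```

Passing the port test in as a parameter keeps the decision logic testable without touching the network. In a real filter the blacklist set would be loaded from and appended to the one-IP-per-line text file.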

 

Result

After the plugin had been online for 12 hours, the mail server had handled 85 mails in total, of which 32 were tagged as SPAM by SpamAssassin. 5 of the 32 passed the Proxytest check, leaving 27 blocked, which is an 84.38% hit rate. The file blacklist.txt works like a cache: IP addresses that are assumed to be open proxies are added to this list, and mails from IP addresses already in the list are blocked directly without performing the test again. There were 24 entries in blacklist.txt, indicating that 24 unique hosts had a common proxy port open and that only 3 mails were blocked directly from the blacklist. The IP addresses are shown below.

202.47.224.67

212.68.55.67

60.14.97.38

91.193.22.92

125.243.9.55

212.13.172.53

202.115.7.51

211.147.3.74

184.106.189.175

182.71.3.154

203.253.25.48

211.147.3.75

58.242.248.15

212.34.250.66

62.215.5.69

220.69.24.12

201.86.129.43

190.82.68.186

203.86.2.57

112.121.190.138

74.117.158.202

187.6.48.35

85.18.116.26

201.231.54.53

 

After the 12-hour test, we extended the test to 9 days (from 0:00 Jan. 7th, 2011 to 23:59 Jan. 15th, 2011) and received 1272 mails in total. 374 mails were tagged as SPAM, and 65 of them passed Proxytest. The remaining 309 were banned, a hit rate of 82.62%. 231 of the 309 banned mails came from hosts that were newly assumed to be open proxies and put in the blacklist. Only 78 mails were dropped directly without testing the proxy ports.

As a result, we can see that few addresses are duplicated, and thus the blacklist hit rate is low. The reason might be that the collecting period is relatively short. This might change after the system has run for a longer time.

This method found 24 and 231 different proxies in 12 hours and 9 days respectively. Although heuristic content checking had already tagged these mails as SPAM, anti-spamming could still be improved by blocking the IP before the mail is even passed to the content filter.
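The figures of both test runs can be recomputed from the reported counts. The sketch below uses only those counts, and the variable names are chosen for this illustration. It derives the number of blocked mails, the block rate, and the share of blocked mails that were served from the blacklist: 3 of 27 (about 11%) in the 12-hour run and 78 of 309 (about 25%) in the 9-day run.

```python
def percent(part, whole):
    return 100.0 * part / whole


# (tagged as SPAM, passed Proxytest, new blacklist entries) per test run
runs = {
    "12 hours": (32, 5, 24),
    "9 days": (374, 65, 231),
}

summary = {}
for name, (tagged, passed, new_entries) in runs.items():
    blocked = tagged - passed
    cache_hits = blocked - new_entries   # blocked without a new port test
    summary[name] = {
        "blocked": blocked,
        "block_rate": percent(blocked, tagged),
        "cache_hits": cache_hits,
        "cache_share": percent(cache_hits, blocked),
    }

for name, row in summary.items():
    print(name, row)
```

The share of blacklist hits grows with the longer run, which is consistent with the expectation that the cache becomes more useful as the system keeps running.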

 

Conclusion

The author investigated the SPAM mails in his own personal mailbox and found that many of the mails tagged as SPAM were delivered via open proxies. A simple checking program was developed and plugged into the author’s local mail server, and it blocked more than 80% of the tagged SPAM mails in both test runs.

Not only are these mails blocked, but the program also obtains a list of open proxies. These hosts had proxy ports open at the time, so further action can be taken, such as submitting the open proxies to a well-known DNSBL provider or starting our own DNSBL. The program could be extended to perform this process automatically.

Similar SPAM research (Steding-Jessen, Vijaykumar, & Montes, 2008) reaches the same conclusion as this SPAM proxy finding. In the honeypot of that research, most SPAM came from Taiwan, through the local ISPs TFN, Hinet, and Seednet. Calais et al. (2009) show that the entire spamming botnet has a very complex connection between the victims and the open proxies. The method in this article only touches the open proxies and the spammed mailboxes. Future work is to follow the honeypot method or to investigate further the victims that connected to the open proxies.

References

Templeton, B. (n.d.). Reaction to the DEC Spam of 1978 [WWW Page]. URL http://www.templetons.com/brad/spamreact.html, http://en.wikipedia.org/wiki/E-mail_spam

Marty, R. (2008). Applied Security Visualization. Boston: Pearson Education.

F-Secure. (2005). F-Secure Virus Descriptions: Delf.h [WWW Page]. URL http://www.f-secure.com/v-descs/lonbomb.shtml

Steding-Jessen, K., Vijaykumar, N. L., & Montes, A. (2008). Using low-interaction honeypots to study the abuse of open proxies to send spam. INFOCOMP Journal of Computer Science.

Calais, P. H., Guedes, D., Meira, W., Jr., Hoepers, C., Chaves, M. H. P. C., & Steding-Jessen, K. (2009). Spamming chains: A new way of understanding spammer behavior. In Proceedings of the 6th Conference on E-mail and Anti-Spam (CEAS).

 
