How to trace an attack from Tor network?
#1
Posted 14 December 2005 - 10:42 AM
Thanks,
Jack
#2
Posted 14 December 2005 - 11:35 AM
Hello, my company webserver is getting continuous attacks from someone which I discovered using a node of Tor network. Is it possible to retrieve any more details on him?
Thanks,
Jack
No (unless he's particularly dumb and has compromised himself).
http://tor.eff.org/f...en#TracingUsers
I have a compelling reason to trace a Tor user. Can you help?
There is nothing the Tor developers can do to trace Tor users. The same protections that keep bad people from breaking Tor's anonymity also prevent us from figuring out what's going on.
Some fans have suggested that we redesign Tor to include a backdoor. There are two problems with this idea. First, it technically weakens the system too far. Having a central way to link users to their activities is a gaping hole for all sorts of attackers; and the policy mechanisms needed to ensure correct handling of this responsibility are enormous and unsolved. Second, the bad people aren't going to get caught by this anyway, since they will use other means to ensure their anonymity (identity theft, compromising computers and using them as bounce points, etc).
But remember that this doesn't mean that Tor is invulnerable. Traditional police techniques can still be very effective against Tor, such as interviewing suspects, surveillance and keyboard taps, writing style analysis, sting operations, and other physical investigations.
Edited by tiocsti, 14 December 2005 - 11:37 AM.
#4
Posted 19 December 2005 - 05:12 AM
They publish a list of open Tor servers, so just block them.
If everyone did this it would render Tor unusable for people who legitimately want to protect their privacy, which would be a shame.
Tor's already banned from a lot of IRC sites, which sucks.
We shouldn't be telling him to do this; we should be advocating installing better security, surely? Informing the police too, if actual damage occurs.
What sorts of things are they doing with Tor? Are you really at that great a risk? Couldn't you employ some sort of temporary ban for an IP address if you get so many failed logins, etc?
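A rough sketch of the temporary-ban idea, in Python. Everything here is an assumption for illustration: the `TempBan` class, its thresholds and the way failures get reported to it are made up and not taken from any particular web server or login system.

```python
import time
from collections import defaultdict, deque


class TempBan:
    """Ban an address for a while after too many failed logins in a window."""

    def __init__(self, max_failures=5, window=60.0, ban_seconds=600.0):
        self.max_failures = max_failures      # failures allowed inside the window
        self.window = window                  # sliding window length, in seconds
        self.ban_seconds = ban_seconds        # how long a ban lasts, in seconds
        self.failures = defaultdict(deque)    # ip -> timestamps of recent failures
        self.banned_until = {}                # ip -> time at which the ban expires

    def record_failure(self, ip, now=None):
        now = time.time() if now is None else now
        recent = self.failures[ip]
        recent.append(now)
        # Forget failures that have fallen out of the sliding window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self.banned_until[ip] = now + self.ban_seconds
            recent.clear()

    def is_banned(self, ip, now=None):
        now = time.time() if now is None else now
        return self.banned_until.get(ip, 0.0) > now
```

The login handler would call `record_failure(ip)` on each bad password and check `is_banned(ip)` before processing a request. A ban keyed on the source address only helps while the attacker keeps arriving from the same address.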
#5
Posted 19 December 2005 - 05:40 AM
That's sort of the problem. Tor can use any IP at the cloud's disposal, and the end user (the one using/attacking) picks up new IPs within the cloud on a semi-random basis. So if it was coming from one address, 30 seconds to a couple of minutes later it's coming from another.
Tor really is a double-edged sword. The EFF is hoping to leverage it legally, much as with P2P networks (technically, by design, it's P2P), to provide a privacy space. Personally, I think that in the long run the government (at least in the US) will go ape over this sort of obscurity.
But I look at it like how we hacked back in the day. Some 20 years ago, we would trade user accounts to obscure our tracks through the network. The idea was that if you bounced through enough machines, it would take the system admin getting the shaft a long time to contact each sysop up the chain of the trace, hoping they kept access logs, or to coordinate the trace, since each sysadmin would have to do a 'who -l' (if they didn't have finger enabled) to trace back where the intruder came in from.
Tor is just an automated evolution of this sort of network hopping. The fact that you can use it for IRC just adds to that. If I understand the architecture right, you can use any program/protocol that you can socksify into the Tor cloud, and zip, you no longer come from your home IP but from someone else's on the cloud.
But how do you trace it?
It's funny how technology works. Since computers are dumb machines that we make intelligent, the answer is a rather dumb solution.
>>> Add more TOR servers to the cloud. <<<
If you're running Tor while connected to the cloud and you pull a netstat, you'll see every connection coming from your machine. Obviously they may be bouncing from another machine, but let's just think about this.
Every minute or two, the system randomly changes the path to the egress system. Say I'm the system admin for xyz corp and I want to capture who's been putting dirty pictures of me and my wife on the corporate page. I create a bunch of Tor servers and begin logging all the traffic (ala carnivore stylez). What I'm interested in is the addresses. Obviously the destination is my web site, and the source will directly correspond to the amount of traffic generated to the destination address on the cloud.
Then the law of averages applies. The chances of me obtaining his true IP will steadily increase over time as his packets travel through the cloud and eventually route through my Tor cluster. I've even seen Tor reroute through myself from time to time. So you know that it's only a matter of time. The same could be said about DoS attacks. It has to originate somewhere. This is a perfect example where Tor != privacy. Results may vary.
-jf
#6
Posted 19 December 2005 - 06:19 AM
No, I don't think that would work.
http://tor.eff.org/overview.html.en
Here's my understanding of how a Tor connection is established:
1. client gets server list.
2. picks exit node.
3. creates encrypted tunnel to exit node (but never directly).
4. uses it.
The critical point being: only the exit node can decrypt and view the packets, right? Correct me if I'm wrong but I think:
1) only the exit node can decrypt the packets and
2) the exit node has no idea where the packets came from, it only knows what the last link in the chain was.
So, I don't think you could ever trace back the link.
Edited by coding_monkey, 19 December 2005 - 06:30 AM.
#7
Posted 19 December 2005 - 08:09 AM
Sure, you can't sniff the wire. But you don't care about content, you care about IP addresses. I've seen it connect through Tor to myself, so I don't necessarily buy #3. In any case, the law of averages would win out, because if I'm tracking all the incoming and outgoing connections on my Tor cluster (Tor does not encrypt IPs or DNS resolution at the OS level), then I could theoretically write a script that logs all the connection traffic to a database, then query the DB to rank the highest number of connections to a certain address based on traffic alone. Obviously this wouldn't work for the Googles, Yahoos and MSNs of the world, but then you factor page pull time into it and you could narrow it down quite a bit.
You mention that the exit node is the only one that can decrypt in the clear. That's because it needs to for access to the public internet. However, all the nodes in between are exposing IP information regardless, so that the packets are routed properly between the nodes (hence OS-level logging) before they hit the exit node. You may not know what a packet contains, but you do know that someone at point a is sending something to point e via b, c and d, where you are c and can see that b and d are talking. That packet obviously wouldn't matter much, other than increasing the score for when, at some point, a or e shows up on your Tor node and raises the score even higher. It's all in the averages and probability.
At least you'd have better odds than winning the lottery.
-jf
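For illustration, the "log the connections and rank them" step could look something like this in Python. It is only a sketch: `rank_sources`, the `(timestamp, peer_address)` log format and the `relay_addrs` set are all hypothetical, and the code does nothing more than count peers. Whether the top-ranked address has anything to do with the attacker is exactly what is in dispute here.

```python
from collections import Counter


def rank_sources(connections, relay_addrs, top=5):
    """Rank peers seen by your relays, ignoring peers that are known relays.

    connections: iterable of (timestamp, peer_address) tuples from your own logs.
    relay_addrs: set of addresses assumed to be Tor servers.
    """
    counts = Counter(
        addr for _timestamp, addr in connections if addr not in relay_addrs
    )
    return counts.most_common(top)


# Made-up log entries using documentation-only address ranges.
log = [
    (0, "198.51.100.7"),
    (1, "203.0.113.9"),
    (2, "198.51.100.7"),
    (3, "192.0.2.10"),
]
ranking = rank_sources(log, relay_addrs={"192.0.2.10"})
```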
#8
Posted 19 December 2005 - 08:27 AM
To create a private network pathway with Tor, the user's software or client incrementally builds a circuit of encrypted connections through servers on the network. The circuit is extended one hop at a time, and each server along the way knows only which server gave it data and which server it is giving data to. No individual server ever knows the complete path that a data packet has taken. The client negotiates a separate set of encryption keys for each hop along the circuit to ensure that each hop can't trace these connections as they pass through.
from http://tor.eff.org/overview.html.en
---
So what it's doing is making an encrypted circuit between the source and destination nodes, based on the random server list it obtains when it joins the network, by making connection after connection through the cloud, sort of like a boring machine through a mountain.
Note: If you are the individual server in between, you may not know the complete path but you sure as hell know that you're sitting in a tunnel (from the analogy of a rock in the middle of the mountain tunnel).
The last part where it negotiates a separate set of encryption keys for each hop only refers to the circuit encryption not the actual IP based routing so one cannot sniff the circuit looking for routing information.
What I speak of is basically SIGINT (to borrow a military term), where places like Fort Huachuca in Arizona and US Navy ships monitor a given country's communications traffic and, based on who talks to whom, determine who they are and what they are talking about (sometimes they luck out and get clear traffic). Sure, there are a couple more tricks to it, but it's basically the same deal. (Don't you people read Tom Clancy novels, or haven't you seen 'Clear and Present Danger'?)
-jf
#9
Posted 19 December 2005 - 08:32 AM
I don't get it... let's say you have:
A--->B--->C (really simple Tor connection where A is the guy who's trying to hack your site, B is a middleman node and C is the exit node).
How would C even know about the existence of A? There is *nothing* in any of the data that goes from B to C to suggest that A is even the client of the request; for all C knows, B could be the client. All C does is route the response back to B, then it's up to B to either consume it or pass it on. If it passes it on, it only knows the next link to pass it on to, so it doesn't know whether A is going to consume it or just pass it on.
So how do you think you can trace anyone? Maybe I'm missing something, but I'd be interested to know if there is a way, and so would the guys at the Tor project, I bet.
#10
Posted 19 December 2005 - 08:42 AM
Tor works like this:
1: Gets server list
2: generates path
3: encrypts each element through path
Now, let's say you are the exit node; you will know:
1) Content of the message
2) the target address
You will not know the source of session, though.
Now let's say you are the entry node, you will know:
1) Origin of message
2) Next hop address
You will not know destination address, nor its content. You just remove your layer of the onion, and pass it on to the next guy.
If you're the middle node (assuming a path of 3) you only know the previous node in the path and the next node in the path.
Your statement that you know that person a is sending information to point e via points b, c, and d is true only if you have access to the network traffic of the message's originator; none of the Tor nodes have that information.
Additionally, messages are padded to make traffic analysis more difficult.
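The breakdown above can be written down as a small Python model. This is a toy, not Tor code: `node_view` and the path labels are made up, and it only encodes the rule that a relay sees its previous and next hop, that the entry relay sees the client, and that the exit relay sees the target and the content.

```python
def node_view(path, position):
    """Return what the relay at path[position] can observe.

    path is [client, relay_1, ..., relay_n, target]; position indexes a relay.
    """
    if not 1 <= position <= len(path) - 2:
        raise ValueError("position must index a relay, not an endpoint")
    is_entry = position == 1
    is_exit = position == len(path) - 2
    return {
        "previous_hop": path[position - 1],
        "next_hop": path[position + 1],
        "knows_source": is_entry,        # only the entry relay talks to the client
        "knows_destination": is_exit,    # only the exit relay talks to the target
        "sees_content": is_exit,         # only the exit relay removes the last layer
    }


path = ["client", "entry", "middle", "exit", "target"]
views = {path[i]: node_view(path, i) for i in (1, 2, 3)}
```

No single entry in `views` has both `knows_source` and `knows_destination` set, which is the point being made: one relay on its own never holds both ends of the circuit.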
#11
Posted 19 December 2005 - 08:43 AM
The operator of site B would pull up a console and type: netstat
It would give you something that looked like this...
Active Connections
  Proto  Local Address       Foreign Address                    State
  TCP    bridgecontrol:1805  test.datenfreihafen.de:6881        ESTABLISHED
  TCP    bridgecontrol:1822  d54C33485.access.telenet.be:59148  ESTABLISHED
I haven't looked that closely myself as to whether or not Tor uses the same port for incoming connections from the cloud, but I'm willing to bet that's another way of narrowing down the chaos from netstat.
Anyway, your OS knows who it is talking to. By looking at this you could tell that Point A is talking to Point B (in your example) just by combing through this list.
Making sense to you now?
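For what it's worth, pulling the peers out of a listing like that is easy to script. A minimal Python sketch, assuming the four-column layout shown above; `parse_netstat` is a made-up helper, and real netstat output varies by OS and flags.

```python
def parse_netstat(text):
    """Extract (local, foreign_host, foreign_port) from ESTABLISHED TCP lines."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the banner, the column header and anything not in the 4-field layout.
        if len(fields) != 4 or fields[0] != "TCP" or fields[3] != "ESTABLISHED":
            continue
        host, _sep, port = fields[2].rpartition(":")
        peers.append((fields[1], host, int(port)))
    return peers


SAMPLE = """Active Connections
Proto Local Address Foreign Address State
TCP bridgecontrol:1805 test.datenfreihafen.de:6881 ESTABLISHED
TCP bridgecontrol:1822 d54C33485.access.telenet.be:59148 ESTABLISHED
"""
peers = parse_netstat(SAMPLE)
```

This only tells you which hosts your own machine is connected to. It says nothing about where those hosts got the traffic from or where it goes next.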
#12
Posted 19 December 2005 - 08:46 AM
As a practical matter, C can determine the likelihood that B is the client or not by comparing its address to the list of Tor servers. If it's not on the list, it's likely the originating client; if it is on the list, then it's likely just a Tor server.
This, of course, doesn't work when it's both.
#13
Posted 19 December 2005 - 08:46 AM
You're focusing on the content of the packet and not the envelope.
Edited by jfalcon, 19 December 2005 - 08:47 AM.
#14
Posted 19 December 2005 - 08:52 AM
As a practical matter, C can determine the likelihood that B is the client or not by comparing its address to the list of Tor servers. If it's not on the list, it's likely the originating client; if it is on the list, then it's likely just a Tor server.
This, of course, doesn't work when it's both.
I'm not sure it could even figure it out with the server list. Since Onion/Tor is a p2p network, everyone is a server and everyone is a leaf. Not that it couldn't, but you're going to have a lot of false positives (or negatives).
#15
Posted 19 December 2005 - 08:54 AM
That's useless information; let's take the typical Tor connection, which looks like this:
source > A > B > C > destination
Okay, now if you managed to compromise Tor node A, you have the address of source, and you have the address of B. You don't know the address of C or destination, nor do you know the content of the message.
If you have managed to compromise B, you know the address of A and the address of C and can determine they are both Tor nodes. You don't know the source, the destination, or the content of the message.
If you have managed to compromise C, you have the content of the message, its destination, and the address of B. You don't have any information on the source.
What this means is you need a minimum of 2 nodes being compromised (A and C) to have both the source and destination of the message, and even then the likelihood of being able to do so will depend to some degree on the number of sessions A is handling (that are routing through
Since the client controls the path, the likelihood of getting what you want is quite small. Remember, if you don't get A,C and get, say, A,B or B,C, then you don't have all the information you need.
#16
Posted 19 December 2005 - 09:14 AM
That's useless information; let's take the typical Tor connection, which looks like this:
source > A > B > C > destination
Okay, now if you managed to compromise Tor node A, you have the address of source, and you have the address of B. You don't know the address of C or destination, nor do you know the content of the message.
If you have managed to compromise B, you know the address of A and the address of C and can determine they are both Tor nodes. You don't know the source, the destination, or the content of the message.
If you have managed to compromise C, you have the content of the message, its destination, and the address of B. You don't have any information on the source.
What this means is you need a minimum of 2 nodes being compromised (A and C) to have both the source and destination of the message, and even then the likelihood of being able to do so will depend to some degree on the number of sessions A is handling (that are routing through.
Since the client controls the path, the likelihood of getting what you want is quite small. Remember, if you don't get A,C and get, say, A,B or B,C, then you don't have all the information you need.
Well, I hate to piss on your Cheerios, but the question is how I could track an attacker through Tor. You're going on about reading and circumventing the encrypted circuit itself. Now, none of what I'm talking about would work if your site is so piss-poor in security and riddled with holes that it looks like a greasy 14-year-old kid, but I'm talking about someone persistent in getting into your site (let's say they're doing HTTP POSTs trying to gain access to your pr0n section).
So you've got all this traffic going to your site. From the site owner's point of view it is coming from everywhere. But let's say you are the site owner and you realize it's coming from Tor. You set up a bunch of Tor servers, so you add your part to the cloud, and then you begin logging all your real-time TCP/IP connections (like a sniffer, but without caring about the content of the message). All you want to know is where this traffic is originating from.
Now here's where your:
Source > A > B > C > Destination
comes into play.
Let's say you're the site owner so you know what Destination is.
And let's say you're actually picked in the cloud to be B.
So your Tor node would know traffic is going from A to C.
Sure... doesn't mean much.
Let's say the next time you're C... Again doesn't mean much.
And of course for A.... again doesn't mean much.
So you think...
Take the average number of times that A, B and C are used to connect to Destination.
Chances are they are going to be lower than the number of times that Source and Destination come up on the list, since every minute or two A, B and C are going to change.
With thousands of packets flying around and depending on how big the Tor cloud is, you could be talking about a few hours or a few weeks depending on how hard they are hammering your server.
Now are you seeing it?
Hence the reason you add more Tor servers. That increases the chances you are A or C.
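To put rough numbers on "add more Tor servers", here is a back-of-the-envelope Python sketch. The model is an assumption for illustration and not how Tor actually selects paths: it supposes the entry and exit are each picked independently and uniformly from all servers, and that you run a fraction `fraction` of them. Both function names are made up.

```python
def p_both_ends(fraction):
    """Chance that one circuit uses your servers as both A (entry) and C (exit).

    Assumes independent, uniform selection of each position (a simplification).
    """
    return fraction * fraction


def p_seen_within(fraction, circuits):
    """Chance that at least one of `circuits` independent circuits is caught."""
    return 1.0 - (1.0 - p_both_ends(fraction)) ** circuits


# Example: you run 5% of the servers and the attacker builds 1000 circuits.
per_circuit = p_both_ends(0.05)
over_time = p_seen_within(0.05, 1000)
```

Under those assumptions the per-circuit chance goes with the square of your share of the network, so it starts out tiny and only adds up over many circuits. If the attacker never picks your server as A, the chance is zero however long you wait.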
Edited by jfalcon, 19 December 2005 - 09:16 AM.
#17
Posted 19 December 2005 - 09:15 AM
I'm not sure it could even figure it out with the server list. Since Onion/Tor is a p2p network, everyone is a server and everyone is a leaf. Not that it couldn't, but you're going to have a lot of false positives (or negatives).
This is not true; not everyone is a server in the Tor network. It's not a P2P network in that sense of the word. Tor has a way to get a server list so you can generate a random path. The server list does not include leafs (what Tor calls clients), thus you can differentiate leafs from non-leafs (except in the case that the client and server are the same, as I already mentioned).
#18
Posted 19 December 2005 - 09:25 AM
This is not true; not everyone is a server in the Tor network. It's not a P2P network in that sense of the word. Tor has a way to get a server list so you can generate a random path. The server list does not include leafs (what Tor calls clients), thus you can differentiate leafs from non-leafs (except in the case that the client and server are the same, as I already mentioned).
But what dictates how the server list is created? I would think that when you join the network, and periodically throughout the session, it's pulling the server list every hour or so. Since nodes go up and down all the time, you're bound to miss servers/clients, hence the false positives/negatives. I don't think it's a very reliable way of dealing with the question.
#19
Posted 19 December 2005 - 09:31 AM
Your attack has some practicality problems, and some things can make it entirely impossible.
For example, what do you do if I choose a longer path? What do you do if I explicitly choose node A? What do you do if just by bad luck, you don't get the traffic? What do you do if the attacker runs a tor node that he always uses for hop A?
Setting up these servers makes it so you have a nonzero chance of tracking the person down, but it's still not a very good chance.
And, of course, there's the problem of practicality. Let's say you do this and it's effective: what happens if you get attacked through another network or (more likely) a chain of hacked proxies? You've now wasted a lot of time and effort for nothing.
#20
Posted 19 December 2005 - 09:32 AM
Take the average number of times that A, B and C are used to connect to Destination.
Chances are they are going to be lower than the number of times that Source and Destination come up on the list, since every minute or two A, B and C are going to change.
As I stare at my writing, I should point out that basing it on time would definitely be beneficial, as it would narrow the scope a bit.
I think the only reservation I have with my own argument is: what if you're sitting next to a heavy talker? Then one would have to base it on the amount of traffic in and out of the Tor network, with a margin of variance.
This of course makes the job tougher because, as the sysadmin, you still have to differentiate Tor sources from legit traffic (I guess this is where the server list would come in handy).
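A sketch of that last step in Python, splitting request sources by whether they appear on the server list. `split_by_exit_list` is a hypothetical helper, and `exit_addrs` is assumed to be a set of addresses you have already pulled from the published list; fetching and refreshing that list is not shown.

```python
def split_by_exit_list(request_ips, exit_addrs):
    """Partition request source addresses into (via_tor, other).

    request_ips: source addresses taken from your own web server log.
    exit_addrs: set of addresses assumed to be current Tor servers.
    """
    via_tor, other = [], []
    for ip in request_ips:
        (via_tor if ip in exit_addrs else other).append(ip)
    return via_tor, other


# Made-up addresses from documentation-only ranges.
via_tor, other = split_by_exit_list(
    ["192.0.2.10", "198.51.100.7", "192.0.2.10"],
    exit_addrs={"192.0.2.10"},
)
```

Since servers come and go, a stale `exit_addrs` will misfile some requests in both directions.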