zzz
good morning orignal
orignal
back
zzz
I wanted to fill you in on some advice I've gotten
zzz
I've enlisted the help of an outside company to assist us
zzz
So far, what I've received is pretty generic and high-level
orignal
go ahead?
zzz
but it's still valuable to get outside opinion
zzz
and it helps us organize and focus our response
zzz
there's 3 categories of suggestions
zzz
all 3 we have in Java i2p already, but they need fine-tuning and improvement
zzz
but I suspect all 3 need major work on the i2pd side
orignal
tell me
zzz
here we go:
zzz
3 ways to counteract ddos attacks:
zzz
1) Rate limiting
zzz
rate limits help slow down ddos. can be done at several locations and protocols within the router
zzz
2) Traffic filtering
zzz
block by IP and router hash
zzz
avoid distributing peer info about blocked ips/routers to other routers
zzz
3) Peer discovery and verification
orignal
we have it now
orignal
say we ban unreachable routers for 2 hours
zzz
classify and validate new peers, don't spread unvalidated info around to other routers
zzz
EOT
zzz
pretty generic, but a good way to structure things
orignal
we have 2 and 3
zzz
you don't have a ban-forever IP or router hash list though, do you?
orignal
remember for 3 I verify IP now
dr|z3d
definitely don't have rate limiting/throttling either, no?
orignal
and if it doesn't match with the actual one I close the connection
orignal
I ban for 2 hours
zzz
right, you do have some 2) and 3), for sure. But we both need to improve things
orignal
not ban just exclude from my netdb
orignal
ban by IP maybe
zzz
you need a permanent IP and hash ban list
orignal
I had it in NTCP many years ago
zzz
we both need to do a lot better in 3)
orignal
it's easy I can do it
orignal
about 1
orignal
well I'm not sure if it's the right thing
orignal
because you don't know the rate
zzz
all things need limits. The hard part is figuring out where to do it and how to measure
zzz
there's really two parts for us:
orignal
remember you are always limited by your network port
zzz
1a) don't get overloaded yourself. That's usually not a problem for i2pd, it's fast and computers are powerful these days
zzz
1b) don't send overload to others
orignal
but you need a definition of "overload"
zzz
right. 1) is the hardest
zzz
1a) "overload" for yourself isn't so hard. Queue sizes or queue overflow or latency are indicators of overload
zzz
1b) not sending too much out is harder. Throttles or rate limits based on typical patterns can help
orignal
for myself yes it's easy
zzz
also there can be ramp-up detection, where you don't allow the rate of something to increase too quickly
dr|z3d
like # of transit tunnels, for example.
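A minimal sketch of the ramp-up detection idea mentioned above, applied to something like transit tunnel requests. Everything here is an assumption made for illustration: the class name `RampLimiter`, the fixed time window, the floor of `minPerWindow` accepts, and the `maxGrowth` factor are hypothetical and are not taken from Java I2P or i2pd.

```java
// Hypothetical sketch: limits how fast an accepted rate may grow between windows.
final class RampLimiter {
    private final long windowMs;     // length of one measurement window
    private final int minPerWindow;  // always allow at least this many per window
    private final double maxGrowth;  // allowed growth factor over the previous window
    private long windowStart;
    private int previous;            // accepts in the previous window
    private int current;             // accepts in the current window

    RampLimiter(long windowMs, int minPerWindow, double maxGrowth, long now) {
        this.windowMs = windowMs;
        this.minPerWindow = minPerWindow;
        this.maxGrowth = maxGrowth;
        this.windowStart = now;
    }

    // Returns true if one more event may be accepted at time 'now' (milliseconds).
    synchronized boolean tryAccept(long now) {
        roll(now);
        int limit = Math.max(minPerWindow, (int) Math.ceil(previous * maxGrowth));
        if (current >= limit) {
            return false; // growing too quickly compared with the last window
        }
        current++;
        return true;
    }

    // Advances to the window containing 'now', remembering the last window's count.
    private void roll(long now) {
        long elapsed = now - windowStart;
        if (elapsed < windowMs) {
            return;
        }
        // After one or more fully idle windows the previous count is zero.
        previous = (elapsed < 2 * windowMs) ? current : 0;
        current = 0;
        windowStart += (elapsed / windowMs) * windowMs;
    }
}
```

With `new RampLimiter(60_000, 10, 2.0, 0)`, the first minute accepts at most 10 events and the next minute at most 20, so a sudden jump is clipped while slow growth passes.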
zzz
an example of 1b) is where we drop tunnel build requests rather than reject or accept under certain situations
zzz
so we don't propagate the traffic
zzz
that's all I got, if I get more later I will pass it along
zzz
I thought it was helpful just to organize our work a little
orignal
do they have concrete recommendations?
orignal
about current situation
dr|z3d
I want to air an idea, an informal proposal, regarding classes of routers.
zzz
not yet, maybe later, may take a while
orignal
basically about the source of the attacks
dr|z3d
let's say we monitor different classes of routers with the same caps, eg XfR, LU etc..
zzz
I'm not sure how much they can do or when. They are busy also
orignal
can they evaluate how much power is needed
orignal
simply speaking
zzz
ideally, yes, they could do that for us. I'm hoping
dr|z3d
with a per-classes throttle using buckets, it might make sense to drop/reject requests from a given class in a given timeframe if they ramp too much.
orignal
does this attack come from one moron or from an organization with a budget
zzz
right orignal I've asked those questions
orignal
for me it's the most important question
zzz
we may have to motivate them with $$ to do it
zzz
don't know yet
orignal
the "salt" guy claimed he did it to write his university degree paper
orignal
but we don't believe him
zzz
interesting :)
zzz
ask what university so we can file complaint :)
dr|z3d
if we only knew the university.
dr|z3d
exactly my thoughts, zzz.
dr|z3d
so, buckets for classes of router.. any thoughts?
dr|z3d
if someone's spinning up a few thousand routers on a whim using a standard profile..
orignal
you can file your complaint to sportloto.i2p with the same results
orignal
have you heard that Russian hackers can't be prosecuted in Russia if they do it in the name of Russia?
dr|z3d
we should be able to detect and defend against that.
orignal
it's just FYI
zzz
orignal, do you pick U peers for tunnels?
orignal
yes, for middle peer
orignal
and for OBEP
dr|z3d
U needs to die. U is for useless.
orignal
I don't pick it for IBGW
orignal
I would say U and R are useless
orignal
I don't rely on them
orignal
and rely on addresses only
dr|z3d
if a router's firewalled, it's best avoided for tunnel building. more hops, less reliability.
orignal
I disagree
dr|z3d
if a floodfill's firewalled, then it should be banned outright.
orignal
I convert it to ordinary router
dr|z3d
now is the time to start being a bit brutal with routers that aren't behaving as expected. U class routers, mammoth shit, as you'd say, orignal
orignal
U class routers are mostly ordinary users
orignal
and they do need transit for anonymity
dr|z3d
downgrading a firewalled floodfill and removing the f cap sounds like a good idea. zzz..
dr|z3d
they should be tolerated, but definitely not used for tunnels.
dr|z3d
that's their problem, not ours. want transit, sort out your firewall.
orignal
be realistic
orignal
and many of them can't
orignal
because they use mobile networks
dr|z3d
this is realism. the current network attack makes for some hard choices and a lot less latitude.
zzz
orignal, dr|z3d made that change a long time ago. I resisted
zzz
he is right
zzz
I looked at part. tunnel stats for U routers
zzz
it's an average of about 1 per hour
zzz
I made the change last week
zzz
I think it's going to be a big help for build success
zzz
2nd benefit: we don't force the previous hop to lookup the RI for a crappy U router
zzz
dr|z3d, what's the benefit of buckets? I don't get it
orignal
please explain your definition of U
zzz
U cap
orignal
because when I pick peers I don't look at it
orignal
I only see if the next peer can be reached from the previous one
orignal
that's all I do
zzz
yeah my recommendation is to not use U routers in any tunnels, client or expl.
orignal
this is discrimination against U users
orignal
they need transit
zzz
I looked at the stats. U routers have almost no tunnels now. Average about 1 tunnel
orignal
lol
dr|z3d
the definition of a U cap peer is one that's using introducers.
orignal
ask _mblw_ how much transit he has ))
orignal
they have a lot
zzz
here's an example:
zzz
NU router:
zzz
stat_tunnel.participatingTunnels.60m = 1.43;5.54;82.79%;555;555;555;
zzz
that's an average of 1.43 part. tunnels over the last hour
orignal
again you can ask around people running U routers
zzz
two more NU:
zzz
stat_tunnel.participatingTunnels.60m = 9.72;42.35;137.21%;555;555;555;
zzz
stat_tunnel.participatingTunnels.60m = 2.47;7.10;107.67%;555;555;555;
zzz
LU examples:
zzz
stat_tunnel.participatingTunnels.60m = 0.15;0.15;270.15%;555;555;555;
zzz
stat_tunnel.participatingTunnels.60m = 0.00;1.04;0.00%;555;555;555;
zzz
stat_tunnel.participatingTunnels.60m = 0.08;0.75;23.48%;555;555;555;
zzz
stat_tunnel.participatingTunnels.60m = 0.54;0.62;93.23%;555;555;555;
zzz
stat_tunnel.participatingTunnels.60m = 2.03;5.04;45.40%;555;555;555;
zzz
stat_tunnel.participatingTunnels.60m = 1.42;4.64;72.60%;555;555;555;
zzz
most LU are under 1.0
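A small sketch that checks the "under 1.0" remark against the six LU samples quoted above. It assumes only what the chat states, namely that the first `;`-separated field after the `=` is the 60-minute average; the meaning of the remaining fields is not used. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for reading the stat lines pasted in the chat.
final class PartTunnelStats {
    // The six LU samples quoted in the chat, copied verbatim.
    static final List<String> LU_SAMPLES = Arrays.asList(
        "stat_tunnel.participatingTunnels.60m = 0.15;0.15;270.15%;555;555;555;",
        "stat_tunnel.participatingTunnels.60m = 0.00;1.04;0.00%;555;555;555;",
        "stat_tunnel.participatingTunnels.60m = 0.08;0.75;23.48%;555;555;555;",
        "stat_tunnel.participatingTunnels.60m = 0.54;0.62;93.23%;555;555;555;",
        "stat_tunnel.participatingTunnels.60m = 2.03;5.04;45.40%;555;555;555;",
        "stat_tunnel.participatingTunnels.60m = 1.42;4.64;72.60%;555;555;555;");

    // Returns the first ';'-separated field after '=', which the chat
    // describes as the average number of participating tunnels.
    static double average(String statLine) {
        int eq = statLine.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("no '=' in: " + statLine);
        }
        String firstField = statLine.substring(eq + 1).trim().split(";")[0];
        return Double.parseDouble(firstField.trim());
    }

    // Counts how many stat lines report an average below the given limit.
    static int countBelow(List<String> statLines, double limit) {
        int n = 0;
        for (String line : statLines) {
            if (average(line) < limit) {
                n++;
            }
        }
        return n;
    }
}
```

`PartTunnelStats.countBelow(PartTunnelStats.LU_SAMPLES, 1.0)` gives 4, so four of the six quoted LU routers averaged under one participating tunnel.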
zzz
that's why I know dr|z3d was right about this
zzz
was trying to be nice and give the U routers some cover traffic, but it's not working
zzz
and we can't be nice right now when we have big tunnel build problems
orignal
then why does it work for i2pd users?
zzz
how many tunnels do you have through U routers right now?
orignal
_mblw_ says like 30-40
orignal
on his router
orignal
*** afk ***
orignal
be back in 2 hours
zzz
ok
zzz
<zzz> dr|z3d, what's the benefit of buckets? I don't get it
dr|z3d
you detect abnormal ramping of certain classes of routers and throw them in a bucket.
dr|z3d
x new XfRs in 10m, for example.
zzz
what do you do with the bucket?
zzz
and ramping of what?
dr|z3d
fill it with any given class of routers.
dr|z3d
XfR, LU, whatever.
dr|z3d
when bucket's full, reject connections/tunnel requests from said class.
zzz
so it's a leaky bucket?
dr|z3d
a bit like the connection throttler in the tunnel manager. movable, leaky, call it what you will.
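A minimal sketch of the per-class leaky bucket described above: each newly seen router of a class adds one unit to that class's bucket, the bucket drains at a fixed rate, and requests from the class are rejected while the bucket is full. The class name `ClassBuckets`, the single shared capacity, and the leak rate are assumptions for illustration; the chat leaves open how thresholds would actually be set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of one leaky bucket per router class (for example "XfR" or "LU").
final class ClassBuckets {
    private static final class Bucket {
        double level;   // current fill level
        long lastLeak;  // last time the level was drained, in milliseconds
    }

    private final Map<String, Bucket> buckets = new HashMap<>();
    private final double capacity;
    private final double leakPerSecond;

    ClassBuckets(double capacity, double leakPerSecond) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
    }

    // Returns true if one more new router of this class is allowed at time 'now'.
    synchronized boolean allow(String routerClass, long now) {
        Bucket b = buckets.get(routerClass);
        if (b == null) {
            b = new Bucket();
            b.lastLeak = now;
            buckets.put(routerClass, b);
        }
        // Drain the bucket for the time that passed since the last call.
        double leaked = (now - b.lastLeak) * leakPerSecond / 1000.0;
        b.level = Math.max(0.0, b.level - leaked);
        b.lastLeak = now;
        if (b.level + 1.0 > capacity) {
            return false; // bucket full: this class is ramping too fast
        }
        b.level += 1.0;
        return true;
    }
}
```

Each class is tracked independently, so a burst of new XfR routers fills only the XfR bucket and leaves LU untouched.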
zzz
so the theory is an attacker fleet would all be of the same class?
dr|z3d
yeah, if it's some sort of scripted attack.
dr|z3d
"create 2000 routers from this config"
zzz
and how do we set thresholds for each of these buckets?
zzz
I have 36 different combinations of caps here
zzz
and congestion caps will triple that
dr|z3d
forget congestion caps.
dr|z3d
we're really only interested in 3. b/w tier, ff, and reachable/unreachable.
zzz
ok so I strip congestion caps
zzz
ok so I split out each cap separately? so a XfR router goes in the X bucket and the f bucket and the R bucket?
dr|z3d
I was thinking about combos.
zzz
that's what I'm saying. I have 36 combos
dr|z3d
XfR would be one bucket, XfU another.
zzz
664 NfR
zzz
461 XfR
zzz
458 LR
zzz
355 PfR
zzz
304 XR
zzz
274 PR
zzz
127 NR
zzz
86 LU
zzz
83 OR
zzz
65 OfR
zzz
49 PU
zzz
39 XU
zzz
26 MR
zzz
17 XfU
zzz
15 NfU
zzz
12 P
zzz
9 X
zzz
9 PfU
zzz
7 OU
zzz
7 NU
zzz
6 L
zzz
3 MfR
zzz
3 LfR
zzz
3 KR
zzz
2 XOfR
zzz
2 POfR
zzz
2 Nf
zzz
2 N
zzz
1 Xf
zzz
1 POR
zzz
1 Pf
zzz
1 OfU
zzz
1 O
zzz
1 MU
zzz
1 KU
zzz
1 K
dr|z3d
PO/XO = X.
dr|z3d
buckets could be dynamic, too. until there's x percent of a given combo, don't bother creating a bucket.
dr|z3d
K we don't care about, not enough of those around.
dr|z3d
and they don't do transit in any event.
dr|z3d
if they happen to get to be a problem (unlikely, but possible), we just ban them as a class outright.
dr|z3d
routers without a U/R cap can go into one bucket perhaps?
dr|z3d
so we're whittling down the number of possible combos.
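One way the whittling-down could look, as a sketch only. It keeps the three properties named above (bandwidth tier, floodfill, reachable/unreachable) and ignores every other cap letter, which is how congestion caps get stripped. Assumptions to note: the tier order `KLMNOPX`, keeping only the highest tier letter (so `XOfR` becomes `XfR` and `POR` becomes `PR`; the "PO/XO = X" remark could also be read as folding both into X, which this sketch does not do), and the `?` marker for routers without an R or U cap. The names are hypothetical.

```java
// Hypothetical sketch: reduce a caps string to a bucket key.
final class CapsClass {
    // Bandwidth tier letters, assumed to be ordered from lowest to highest.
    private static final String TIERS = "KLMNOPX";

    static String bucketKey(String caps) {
        int best = -1;          // index of the highest tier letter seen
        boolean floodfill = false;
        char reach = '?';       // '?' groups routers with neither R nor U
        for (char c : caps.toCharArray()) {
            int idx = TIERS.indexOf(c);
            if (idx >= 0) {
                best = Math.max(best, idx);
            } else if (c == 'f') {
                floodfill = true;
            } else if (c == 'R' || c == 'U') {
                reach = c;
            }
            // Any other letter (for example a congestion cap) is ignored.
        }
        StringBuilder key = new StringBuilder();
        if (best >= 0) {
            key.append(TIERS.charAt(best));
        }
        if (floodfill) {
            key.append('f');
        }
        key.append(reach);
        return key.toString();
    }
}
```

With this mapping the `XOfR` and `XfR` rows of the list above share the key `XfR`, and `Nf`, `N`, `P`, `X` and similar rows all end in `?`.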
zzz
precise number of buckets doesn't matter. It's several
zzz
for each bucket do we have to set the limit manually? or they all have the same limit? or it's automatically adjusted?
dr|z3d
ok
dr|z3d
automatically adjusted based on the size of known peers in netdb, no?
dr|z3d
with perhaps different weights for different combos, slower routers get smaller buckets.
zzz
and we'd have these checks at (at least) 3 points? NTCP inbound, SSU inbound, and tunnel build requests?
weko
[14:11:57] <zzz> I looked at the stats. U routers have almost no tunnels now. Average about 1 tunnel
weko
On my very low uptime firewalled router, transit is 1/3 to 1/2 of all traffic. Not bad, I think. In tunnel count, about 5-15 on average
dr|z3d
you're the low level guy, zzz, whatever you think will a) work best b) not slow down the router and c) not hit the ram too hard.
zzz
ok I think I know enough to respond:
zzz
overly complex, very difficult to tune/tweak/test, and overly focused on the attack du jour
zzz
however I am going to look at our overall inbound conn throttles in ntcp and ssu
zzz
I think they need tightening
zzz
because I see routers getting a thousand i/b conns in two minutes after startup
zzz
I thought we had throttles in both places
zzz
we don't need buckets there
zzz
also for IB conns you don't have caps until the handshake is complete, that would be the wrong place to throttle
zzz
thanks weko
weko
Introducers are really helpful
dr|z3d
ok
dr|z3d
if we forget about f and R/U and just focus on bandwidth tiers, still overcomplex, zzz?
orignal
I would rather split by combination of addresses
orignal
guys, let's get back to the LU issue
orignal
what's your problem?
orignal
low tunnel creation rate or what?
orignal
if we exclude LU from transit, then once I see an LU router next in a tunnel, it means it's the owner of the IB tunnel
orignal
profit !!!
zzz
hmm
zzz
forgot about that
zzz
but I think it's already that way because success rate is so low
orignal
I have 50-60% now
orignal
do you think it's low?
weko
30%. No low also.
weko
Not*
weko
We need more investigation before doing something. More discussion, anyway. I suggested an attack model; why am I not right? You think that you are right, but please look at the other side of your changes
zzz
maybe for exploratory only
zzz
definitely not client
zzz
and maybe only 10% of the time or something
zzz
we're talking exploratory, they wouldn't be in client tunnels anyway
zzz
you need to know your exploratory build success rate, not combined
zzz
that's what I'm doing weko
weko
We need more drastic network changes for a general fix
orignal
but why do you need to care about exploratory rate?
zzz
because that's what will improve a lot if we skip U
dr|z3d
60-80% here :)
orignal
then skip it for exploratory only ))
zzz
U isn't in client tunnels anyway, most of the time
orignal
but remember that the main goal of I2P is anonymity, rather than rate
zzz
sure, but it has to work
weko
I also have some ideas about "many IPs" protection, about Sybil protection, about improving the "good" user experience... Please listen to us and me!
orignal
please explain what's wrong with U in a client tunnel?
orignal
I would say the opposite
orignal
U is usually on home PC
dr|z3d
re buckets, here's an even simpler proposal. 1 bucket for U peers.
orignal
e.g. powerful box
weko
As I wrote in i2pd's FAQ "all TCSR >10% is good"
zzz
in practice, even if we allow U in client tunnel, they aren't in fast profile tier
orignal
R is usually on a VPS with limited resources
zzz
it just doesn't happen very often
orignal
think about this idea
zzz
there's very few MU/NU/OU/PU/XU. They're all slow LU
weko
Maybe not
weko
My PC maybe powerful and 24/7 uptime
orignal
ofc I exclude LU from client tunnels because L
orignal
not because U
weko
But I'm U
weko
We should not profile by R/U flags; they just indicate whether we need introducers or not
weko
Whether we need to ask introducers*
weko
Also I think we need general and shared paper/proposal/recommendation about profiling
weko
With anonymity index (user can choose)
weko
Anonymity index means how many peers we choose for our tunnels
weko
But this index affects quality and speed ofc
orignal
zzz, regardless of this problem I see another issue with SSU2
orignal
say I have an SSU2 session and their RouterInfo was good in SessionConfirmed
orignal
later this RouterInfo was expired in my netdb, but peer didn't send an updated one yet
orignal
but sent a PeerTest
zzz
so ask him for it
orignal
how?
zzz
send him a database lookup message :)
orignal
so I can't send Alice's RouterInfo to Charlie because it's not in my netdb anymore
orignal
how do you resolve this situation?
orignal
lookup? what if he is not a floodfill?
zzz
he should always respond to a lookup of his own RI
orignal
hmm something new
weko
We need to prepare many "no backward compatibility" things now, so we can enable the "no backward compatibility" part when we are ready for it
orignal
I thought non-FF should always reply with closest FFs
orignal
but how do you resolve it?
zzz
this is exception
orignal
why?
orignal
it's not rare
zzz
because it's direct, not through tunnel
zzz
here's our "direct lookup" code:
zzz
DatabaseLookupMessage dlm = new DatabaseLookupMessage(ctx, true);
zzz
dlm.setFrom(ctx.routerHash());
orignal
you mean for lookup
zzz
long exp = ctx.clock().now() + 5*1000;
zzz
dlm.setMessageExpiration(exp);
zzz
dlm.setSearchKey(_key);
zzz
dlm.setSearchType(DatabaseLookupMessage.Type.RI);
zzz
OutNetMessage m = new OutNetMessage(ctx, dlm, exp,
orignal
fine
zzz
OutNetMessage.PRIORITY_MY_NETDB_LOOKUP, _oldRI);
zzz
ctx.commSystem().processMessage(m);
orignal
I'm asking about SSU2
zzz
doesn't matter, works the same SSU2 or NTCP2
orignal
do you even handle this situation?
zzz
yes of course
orignal
NTCP2?
zzz
this is higher layer, at netdb layer
zzz
transport doesn't matter
orignal
you don't have peer test there
orignal
and you basically don't need your peer's RI
zzz
if you are going to lookup an RI and you are connected to the peer, just ask him
orignal
no, I'm asking about peer test situation
orignal
not lookup
orignal
so that's what you are doing?
zzz
no we don't do lookups for peer test
orignal
how do you resolve it?
orignal
If Alice's RI is not in your netdb anymore
zzz
who am I again? Bob?
orignal
I'm Bob, Alice sends peer test
zzz
looking...
orignal
through an SSU2 session established a long time ago
zzz
if (aliceRI == null) {
zzz
if (_log.shouldLog(Log.WARN))
zzz
_log.warn("No alice RI");
zzz
// send reject
zzz
sendRejectToAlice(SSU2Util.TEST_REJECT_BOB_UNSPEC, data, fromPeer);
zzz
we just send unspecified failure code
orignal
what is "unspecified code"?
zzz
I guess we could ask Alice but we don't
zzz
alice could send RI with peer test if session is up a long time
zzz
but don't we send RI periodically now? I forget
zzz
SSU2Util.java: public static final int TEST_REJECT_BOB_UNSPEC = 1;
orignal
code 1?
zzz
yes
orignal
we send RI periodically
orignal
but it might be in between
orignal
old expired new not sent
zzz
shouldn't be
zzz
we send RI every 29 minutes, and our shortest expiration is 60 minutes
orignal
you create session with 5 minutes before expiration
orignal
and send peer request after 15 minutes
zzz
true
zzz
but if peer is connected, we never expire without asking him for it first
zzz
that's where we use the "direct lookup" code
orignal
what do you mean?
zzz
in our RI expiration code we have "lookup before dropping"
orignal
so you remove a router from netdb only if it's not connected?
zzz
try to get a new one, don't just delete it
zzz
no
zzz
first we ask him for a new one
orignal
then explain
zzz
if that fails then we delete it
orignal
but it's netdb
orignal
independent thing
orignal
I go through netdb compare timestamps and delete expired
zzz
right but you can send lookup messages directly to connected routers, even if not floodfill
orignal
when?
zzz
we do that, but then we 'lookup before deleting'
zzz
any time
orignal
it means I have to check in netdb
orignal
if router is connected
zzz
right, that's what we do
orignal
overhead
orignal
but fine
zzz
up to you
orignal
will try to do something
zzz
protected void lookupBeforeDropping(Hash peer, RouterInfo info) {
zzz
if (_context.commSystem().isEstablished(peer)) {
zzz
// see DirectLookupJob
zzz
...
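A simplified sketch of the "lookup before dropping" flow described above, not the Java I2P implementation. The two interfaces are hypothetical stand-ins for the connection check and the direct lookup, peers are keyed by plain strings, and the lookup is modelled as a synchronous call that only reports whether a request was sent. In this sketch a connected peer's expired entry is simply kept for one more pass; removing it if no fresh RouterInfo arrives is left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an expiration pass that asks connected peers first.
final class ExpirySketch {
    /** Hypothetical stand-in for the hash-to-connection check in the transports. */
    interface Connections {
        boolean isEstablished(String peerHash);
    }

    /** Hypothetical stand-in for a direct lookup; returns true if a request was sent. */
    interface DirectLookup {
        boolean requestOwnRouterInfo(String peerHash);
    }

    // Removes expired entries from the map and returns the hashes that were dropped.
    // Entries of connected peers are kept if a direct lookup could be sent.
    static List<String> expire(Map<String, Long> publishedAt, long now, long maxAgeMs,
                               Connections conns, DirectLookup lookup) {
        List<String> dropped = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = publishedAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (now - e.getValue() <= maxAgeMs) {
                continue; // not expired yet
            }
            String peer = e.getKey();
            if (conns.isEstablished(peer) && lookup.requestOwnRouterInfo(peer)) {
                continue; // asked the peer for a fresh RouterInfo, keep for now
            }
            it.remove();
            dropped.add(peer);
        }
        return dropped;
    }
}
```

The connection check only runs for entries that are already expired, which keeps the extra lookups limited to the routers about to be removed.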
orignal
Blinded message
zzz
yeah we check the hash->connection maps in both transports
zzz
so two quick hashtable lookups
orignal
but for every router
zzz
sure but expiration is a background job
zzz
different thread
orignal
but if you have one cpu no difference
zzz
true ))
orignal
I have better idea
orignal
once you receive RI from SessionConfirmed bump its timestamp
zzz
yeah but that breaks the signature if you're ff and get asked for it
orignal
I don't extract timestamp from buffer
orignal
I store it separately
zzz
ok
orignal
let me think
zzz
we do about a bazillion hashtable lookups a second, if they are slow we're in trouble anyway ))
orignal
but don
orignal
't you think
orignal
that you are slow because of this ))
orignal
for me every extra lookup slows down
zzz
how often do you run the expiration code?
zzz
most of the hashtable lookups are for configuration things: if (x >= _context.getProperty("max_xxxx")) ...
weko
orignal: i2pd also has many hashtable lookups (as you said), but I guess many times fewer))
orignal
every minute
zzz
orignal: too fast. Do RIs every 5 minutes. LS every minute.
orignal
will do
orignal
I save and clean up at the same time
zzz
now you have the time to do connection check :)
orignal
no I don't
orignal
a router will be more busy ))
zzz
ha
zzz
we only save RIs to disk every 10 minutes
orignal
even after start?
zzz
yeah
orignal
someone starts router then stop it after a minute
zzz
so?
orignal
and reseed again next time
zzz
reseed writes directly to disk for us
orignal
good point
weko
orignal: you do the same, am I right?
orignal
no
orignal
I do it every minute
weko
Write on disk?
weko
I think we can make this configurable
weko
And set default to 5 or 10 mins
orignal
zzz also your opinion about garlic inside garlic. should it be allowed or not?
zzz
looking...
orignal
I drop it for now
orignal
but I think it's a valid situation
zzz
what are the delivery instructions?
orignal
local
zzz
yes we allow it on receive. I'm not sure what the situation is when we send it though
orignal
and handle?
zzz
yes we handle it for LOCAL only. For DESTINATION we drop
orignal
thanks. will implement it
zzz
one case I know
zzz
we send garlic'ed Delivery Status Messages with the LS to be returned back to us as an ack
zzz
to hide the delivery status from the OBEP/IBGW
zzz
afk, back in an hour
orignal
no, it goes to destination not to router
dr|z3d
orignal: if you're tweaking the way you save RIs, maybe consider not writing the crap to disk and keeping it session/memory only.
orignal
too much memory
dr|z3d
you don't store all your RIs in ram?
dr|z3d
if you're reading from disk, then the alternative is to make L,M,N,U expire after 1hr and get deleted. keeps things from getting messy.
zzz
the only thing we allow with delivery type DESTINATION is I2NP Data Message
zzz
we would log anything else as an error
zzz
never seen that error before
zzz
if that was happening everybody would report it
orignal
but it happened
orignal
ofc you didn't see in the logs because it was local
orignal
for router
zzz
oh, I see, you were talking about Delivery Status Message. Never mind.
orignal
I'm talking in general
orignal
it was garlic in garlic sent to router
orignal
either somebody's bug or attack
zzz
tell us what's in it when you decrypt it
orignal
I don't know
orignal
I change the code to drop it
orignal
because I had deadlock
orignal
because recursion with mutex
orignal
I need to redo this logic
orignal
process it instead of dropping
orignal
and we have made the new release fixing this critical issue
zzz
ok
zzz
gosh DtQs is still at it. 7 inbound attempts in 9 minutes on 7 different IPs
zzz
it's worth implementing a banlist for that one router alone
obscuratus
Yeah, he's got a heck of a botnet (or something) going.
obscuratus
They can't be connecting to that many Java I2P routers, they were banned by the newsfeed last month.
zzz
i2pd + android _ bigly is still a lot
zzz
i2pd + android + bigly
obscuratus
I wonder how I2PD handles DtQs. Surely they can't connect to multiple DtQs at the same time.
zzz
none of them have news feed blocking
zzz
I would assume he would stop at one conn but who knows
obscuratus
Bigly uses our lib without the newsfeed. How would we handle that if they weren't on the banlist?
zzz
in ssu we drop the old one and switch to the new one
zzz
not sure about ntcp
obscuratus
It's got me scratching my head what they're trying to accomplish.
obscuratus
I did a reverse DNS on a few of the DtQs IPs last week. They were all over the place, but included Cox, Comcast, and other typical providers of home service.
zzz
I'm working on implementing an inbound throttle
zzz
which is a headscratcher too