RIPE 89

Daily Archives

RIPE 89
29 October 2024
Plenary session
Main hall
11 a.m.

KEVIN MEYNELL: Okay, well welcome back from the coffee break for the start of our second Plenary session today.

So, we have three talks for you. First up we have Ingmar from BENOCS. He received his Ph.D. from the Technical University of Berlin in 2013, and he is currently the CTO and founder of BENOCS, which is working on large-scale traffic analysis, CDN content steering and network optimisation. So I would like to welcome Ingmar to the stage for the first talk.

INGMAR POESE: Hi, I'm going to talk about our paper on ingress point detection, which appeared at SIGCOMM this year, and is actually something that we have been running for a very, very long time. This goes back to 2016, when we wrote the first prototype in a bit of a hurry because we had a problem that we had to solve.

So, you all know this picture: the Cloud with the ISP, you should be familiar with that. Usually there are cities and ISPs, and then you have your different end customers, and some of them have one type of connection and others have another type. We're going to focus on the yellow ones here, because the CDN in this example is connected directly, but it is not only connected directly.

They want to deliver content, and how do they deliver it? They deliver it from somewhere; as an ISP you don't know where. They have their mapping algorithm, you can talk to them about it, but in the end they are delivering it somewhere, and somehow the yellow households are slow. They are in the same city, but the yellow ones are just slow, they just don't work right.

So, after a lot of debugging and a lot of digging through data, we found that this was actually an incorrect mapping of the specific prefix those customers were in. It is actually easy to find out what the source address is, but it's very hard to find out where that source address enters your network, which means we were looking for an easy way to solve this problem.

The second point is, if you do have the CDN connected, you might want to do some kind of traffic steering, meaning that you want to be able to automatically tell a CDN or a hyperscaler: hey, if you are delivering towards this region, or this prefix in my network, could you please use this area or this cluster, so the traffic comes in over there, or over here.
For that, we are actually building topological maps for ISPs, and in order to build those you need the ingress point, because you need the path through the network. But IP doesn't really give you that information.

So we needed to build them.
As in the picture, you can see the CDN could potentially deliver from anywhere. They do a really good job at trying to make it as efficient as possible, but it doesn't always work, and having the debugging capabilities in your hand is something that can help quite a lot.

So, what we're really after here is given a source IP address, on which link does it enter the network? And we're mostly interested in the heavy traffic shifters or the premium services.

So, with that goal in mind, we're going to go through a couple of design challenges, and things that we had to overcome in order to do this.

First of all, if you want to make a statement about what's happening on your border routers, you need to measure on all of them. That's not optional; you really need to look at everything. Now, if you look at this picture carefully, you can also see that we're only measuring the links that come from outside the network, not the ones inside the network. Technically, you might also think about, for example, the upstream links from your subscribers, but those have a dedicated point that you know about, so you don't really need to measure those. You only need to measure on the links where you get traffic from the outside, where you don't have control of the source. We need to do this on all border routers, which could be thousands. You need to look at the traffic and at the source IP addresses; the control plane is not going to help you a lot here. Which also means that when you do this, there is going to be an avalanche of stuff rolling towards you. Mostly NetFlow, IPFIX, sFlow, whatever they are all called, but it's basically an avalanche of data that's going to come towards you.

And the big question at this point is: at which granularity are you going to track this? You can track every individual IP address, and you might get away with that in the IPv4 space, but as soon as we enter the IPv6 space, you have lost; there is not enough RAM that you can build into a computer to track every IPv6 address. So we need to aggregate somehow, we need to make this efficient.

And then there is also the fact that even though announcements on the Internet are /24s or /48s at most in BGP, that doesn't necessarily mean that's the partitioning that's being used to send out the traffic. The clusters, or the subnets that are being used as sources, can be scattered in a much finer-grained way, and you won't actually see that anywhere. To illustrate that: in the CDN we have two /28s and one /25, which could all be in the same /24 being advertised on all of those links, but each of those could come in through a different link. How do you deal with that situation, where we see one subnet but the same subnet is coming in over three different links, and it can really come from anywhere?

So, we needed to look at that as well.

And I don't know how much headwind I'm going to get for this here; in research, there was a lot. BGP announcements go from you, so from your end customers, towards the CDN, but the traffic flows the other way. So, we took special care of not using BGP for this at all, because BGP announcements are simply going the wrong way; they are not going to help you when you are looking at where traffic ingresses into your network.

For me this is an important point: do not trust BGP when looking at the source of something. When you are looking at the destination, fine, all good. But do not trust it when you are looking at the source.

When we then started writing the algorithm, something that we found fairly quickly is that the ingress points actually change more often than we thought. 60% of the ingress points that we're detecting are actually only stable for under an hour, which means we need an algorithm that adapts fast, something that can do the analysis online and give us almost online results.

So, the last point for the design aspects: we were mostly interested in CDN traffic steering, finding mappings and building maps, so this algorithm focuses on heavy traffic links, which are easier to deal with, because you get a lot of samples and you can do better statistics on them. It's much harder to do this with low traffic.
But, point being, most of the traffic that we see has, at a given prefix size which we don't know, a single ingress point. So again, in the example, if you look at this, there are four potential paths that you can see, but you have a very good chance that only one of those paths is used and all the others see none of the traffic at all.

With the design aspects out of the way, let's go into the actual implementation for this.

First of all, what do we need? How are we going to build this ingress point detection? I have already made the point that we're not using BGP as an input, and I had to stress that point here. It's actually the only thing we say we're not going to use.

As an input, we only use flow protocols. I use NetFlow here as a placeholder for any flow protocol that is out there. And we also need a link classification.

It sounds a bit silly, but if you want to do something like that, you can only use the samples that come in on links that are actually coming from another AS. You cannot use any of your internal links; the double counting will break the statistical algorithm. So a link classification can either mean: I make sure that I only sample on exactly those links, which is a valid scenario, although in operations I have mostly seen this not working. It's much better to have some kind of query or database connection that says: measure this, this, this and this link, these are the ones I want to go for, and you basically filter down your flow protocol before you do any analysis to really make sure that you only have those links.
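To make that filtering step concrete, here is a minimal Python sketch of what such a pre-filter could look like, assuming the link inventory is available as a set of (router, interface index) pairs; the record layout, names and data are illustrative assumptions, not the BENOCS implementation:

    # Sketch: keep only flow samples that arrived on links classified as
    # "external" (facing another AS), so no packet is ever counted twice.
    from collections import namedtuple

    FlowRecord = namedtuple("FlowRecord", "router in_ifindex src_ip octets")

    # Hypothetical link inventory, e.g. exported from an OSS/BSS system or SNMP:
    # the set of (router, input interface index) pairs that face other ASes.
    EXTERNAL_LINKS = {
        ("br1.fra", 12),
        ("br1.fra", 13),
        ("br2.ham", 7),
    }

    def external_samples(records):
        """Yield only samples that entered the network on an inter-AS link."""
        for rec in records:
            if (rec.router, rec.in_ifindex) in EXTERNAL_LINKS:
                yield rec

    if __name__ == "__main__":
        flows = [
            FlowRecord("br1.fra", 12, "192.0.2.10", 1500),   # external: kept
            FlowRecord("br1.fra", 3, "192.0.2.10", 1500),    # internal: dropped
            FlowRecord("br2.ham", 7, "198.51.100.4", 9000),  # external: kept
        ]
        for rec in external_samples(flows):
            print(rec)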

We need to scale to thousands of routers, we need to do this continuously. We need to aggregate IPs in a dynamic way where we don't know the prefix sizes. We need to detect those as we go along. And we want to track all the source addresses. We want to aggregate them, and we want to focus on the high volume prefixes and we want to find the dominant ingress points.

With that, let's go to the actual algorithm.

Two stage algorithm. Or rather, two stage process here.

Stage one. Deal with the NetFlow. Like I said, this is an avalanche of things, and the first thing that you need to do is make sure you don't drop them. Make sure you get all the NetFlow in and just sort them into a data structure where you can hold them for a little while.

Then, in regular intervals you go up to stage two. Everything you have collected, insert it into a structure, and do the calculation on where your ingress points are.

As you can see here, the NetFlow requires tonnes and tonnes of CPUs. We have run this with over 40 CPUs just to deal with the NetFlow. The actual detection of the ingress points runs within seconds and doesn't need much, maybe two CPUs and that's it. There is also one thing here I want to point out, and that is the step that says classify ranges. The dynamic prefixes I have been talking about, we call them ranges internally, meaning a range of IP addresses that we assume to come in on the same link.

With that, I'll give a short overview of how the algorithm actually works. What you are seeing here on the right side is the network ‑‑ okay, that's just how it is, I guess.

So, what you see on the right side is the network, and the individual packets as they are coming in. What you see on the left side is the entire IPv4 address space flattened out as a 32-bit integer, and where the source IP address would be located in that address space. At the start, at time stamp zero, we know nothing, so we say the entire Internet is coming in on one link. That's our underlying assumption: everything comes in on the same link. And then we sort things in based on which link they came in on; that's the colours. And as you can see, after some run time we have a very colourful way of describing things. I forgot to say: on the right side of the address space, that's the minimum number of samples that is needed to actually take a decision. This is an example, so in real life those numbers are much, much bigger. But, in the end, before we reach at least 11 samples in this, we are not going to decide anything. If I haven't counted ‑‑ I think we ‑‑ I lost my thought.

Once we have collected all of that, we look at the /0 space, so at the entire space, and check whether there is anything dominant in there. There isn't; it's colourful. So we split it in the middle into two /1s, and since each now only holds half the amount of samples, we require a little bit fewer samples in each of the blocks, and we look at it again.

Now, each of these blocks doesn't have enough samples, which means at this point we can't take any more decisions, so we keep running. We keep running the algorithm, we keep sorting new things in, and we find more samples. At this point, the /1 on the left side has enough samples, so we split that one down again. This time, it looks like the very left /2 could actually be an ingress point. But it doesn't have enough samples yet; it still needs 5, so we can't classify it yet. Whereas the right /2, that's still just colourful.

And the right /1 doesn't have enough samples, not much we can do about that, this continues.

At time stamp 2 we now have enough samples and we can classify the left /2 as an ingress point and say: if we have a source IP address that is within that /2 range, it is likely to come in on the link that is marked as purple. And that's basically how it goes. Like I said, this is an example; in real life you would never classify a /2, this goes down way further, but as an example this is how it goes.

Also the left /2 now has enough samples, which means this is going to be split again, and at some point samples are going to drop; since the blue sample dropped, this could be classified as well.

Now, there is more to this algorithm, and it's all explained in the paper, but that's the general idea: split it down until you reach something where you can make a statement about it, and if for some reason something changes and one of the ranges gets polluted, so you get a different colour in there, you restart from scratch and basically do it again.

Also, if you have two ranges next to each other that have the same colour, so coming in through the same link, we aggregate upwards again. So we always try to have the biggest range possible in there, which means less management and also less state that needs to be kept.

And the last point is, there is also a maximum split where we stop. So in this case we're stopping at /3, which means we won't actually split any further at that point. In real life, we have tested this for v4 with anything between /20 and /32, I think.

But, we actually tested this with a lot of different configurations.
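For readers who prefer code to slides, the following is a small, self-contained Python sketch of the range-splitting idea described above, written purely for illustration; the production system and the paper additionally handle re-classification after pollution, merging of neighbouring ranges, IPv6 and much more, and the parameter names and sample format here are assumptions:

    # Sketch of the two-stage idea: sort samples into ranges (stage 1), then
    # periodically classify ranges with a dominant ingress link or split the
    # mixed ones in half (stage 2).
    import ipaddress
    from collections import Counter

    MAX_PREFIX_LEN = 28      # do not split below /28, as in the production setting
    MIN_SAMPLES_AT_0 = 11    # samples needed for a decision at /0 (toy value)
    ERROR_MARGIN = 0.05      # tolerate 5% of samples on "wrong" links

    class Range:
        def __init__(self, network):
            self.network = network            # an ipaddress.IPv4Network
            self.links = Counter()            # ingress link -> sample count
            self.ingress = None               # dominant link once classified

        def min_samples(self):
            # Roughly halve the requirement with every split, as in the example.
            return max(1, MIN_SAMPLES_AT_0 >> self.network.prefixlen)

    ranges = [Range(ipaddress.ip_network("0.0.0.0/0"))]

    def add_sample(src_ip, link):
        """Stage 1: sort one flow sample into the range covering its source IP."""
        addr = ipaddress.ip_address(src_ip)
        for r in ranges:
            if addr in r.network:
                r.links[link] += 1
                return

    def classify():
        """Stage 2: classify dominant ranges, split the colourful ones."""
        new_ranges = []
        for r in ranges:
            total = sum(r.links.values())
            if total < r.min_samples():
                new_ranges.append(r)          # not enough evidence yet
                continue
            link, count = r.links.most_common(1)[0]
            if count >= (1.0 - ERROR_MARGIN) * total:
                r.ingress = link              # one link clearly dominates
                new_ranges.append(r)
            elif r.network.prefixlen < MAX_PREFIX_LEN:
                # Mixed colours: split in the middle and start counting again.
                new_ranges.extend(Range(s) for s in r.network.subnets(prefixlen_diff=1))
            else:
                new_ranges.append(r)          # at maximum depth, leave undecided
        ranges[:] = new_ranges

    if __name__ == "__main__":
        samples = [("10.0.0.1", "linkA")] * 8 + [("200.0.0.1", "linkB")] * 8
        for _ in range(3):                    # a few 60-second rounds, conceptually
            for ip, link in samples:
                add_sample(ip, link)
            classify()
        for r in ranges:
            print(r.network, r.ingress, dict(r.links))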

So, in the production setting that we have deployed since 2016 or '17 or so, we're using these settings. At the start this was only v4, but a couple of years later this got extended to also cover v6, since the same algorithm applies completely.

So for v4, we use a /28 as the maximum split. We say anything smaller than a /28 is so small we don't want to track it any more. We have an error margin in there, which means if you have 5% mismatches, we don't care. We are looking at the high volume; we want the heavy hitters. We don't want the very small ones, that just makes things harder.

And we run stage 2 every 60 seconds, which means every 60 seconds we look at what we have received and reclassify things.

And in the paper, we then do the parameter study on what effects that has, which you are welcome to read in the paper, I'm not going to go into it in this talk.

This entire thing, this entire algorithm, runs on one commodity server. It is an off-the-shelf server; I think it has 48 cores and 256 gigabytes of RAM, and it's been running on that for the past six years. I don't think it ever got a hardware upgrade. It is still running and it is still working.

So, then at this point, let's look at some results.

Classification ratio:
The orange line is how much traffic was actually classified at what point in time. We used NetFlow and the ingress point detection at the same time to verify the results. The grey in the background is the relative traffic. That's your usual pattern that you'll see in pretty much any residential network: not much happening during the night time, and 6 to 8pm being the peak time.

And you can see that even in the low times, when we only receive 20% of the traffic, the classification doesn't actually drop that much. We never drop under 88% of classified traffic, which means that we can classify almost all of the traffic onto the links it comes in on.

Let's look at this in a little more differentiated way. As I kept repeating, we are mainly interested in the heavy hitters. So this is now split up: the line that you saw before, that's still the orange line at the bottom, but this one now also gives the hit rate for the top 20 traffic shifters, so just the top 20 ASes that push traffic in, as well as the top five. And as you go to those heavy shifters, your hit rate actually increases. They are more stable, they are better managed, they actually do a pretty good job. So, with the top five traffic shifters, which is usually around 50 to 60% of your entire traffic, you are already hitting rates of 97, 98% of identifying the correct ingress point. And this means that if I want to know where one of the hypergiants is actually pushing a certain prefix into my network, all I have to do is ask a database that has maybe a hundred thousand entries every 60 seconds. I can look at it, and look at it over time, and it's just at my fingertips at that point.

We'll never reach the 100%, this is Internet traffic, which means it's messy. There is all sorts of things in there. That's why we're not reaching the 100%.

I want to get back to BGP. And why we're not using it for this.

This is now over the entire time, so from mid-2016 to the end of '21 or '22, where we looked at the data. These are the path asymmetries that we detected using the ingress point detection: what BGP would have told us about where the traffic comes in versus where the traffic actually entered the network. And if you look at your overall traffic, we have a hit rate of a little over 50%, which means almost 50% of the paths that you would have gleaned from looking at BGP would not have been correct according to the traffic that was actually entering your network. This gets better as you go up, let's call it the food chain: the bigger the player, the more stable and the more engineered they are. But even if you are only looking at the top five, you have a hit rate of 85% path symmetry. Which means 15% of the traffic is going to be just wrong, and we're talking about the top five here; that's a lot.

The last point I want to make in the analysis is that with BGP, if you classify through BGP, you stop at /24, which means you can't really see any granularity below that. If we classify through BGP, you are at the grey bar, and over 50% is going to be classified as a /24. If you are running the ingress point detection (like I said, we're running this down to a /28), you can see that the classification actually continues downwards and there is a lot of diversity in this.

With this, I think I have to speed up a little bit.

Operational experience: we used this for network troubleshooting, we used it for CDN-ISP collaboration, we used it for network debugging and watching ingress points, and we used it for comparing what BGP says to what the traffic actually does. The one thing that we haven't solved yet is router level traffic load balancing; I'll leave that out here. If anyone is interested in those problems, come talk to me.
One last thing: we have a prototype for this. If anyone wants to actually test this out, this was done at the University of Kassel by Stefan, who was the first author of the paper. This is completely free; anyone can use it, download it. Don't ask me any questions about it, he did that, I can refer you on, but this is a complete setup including the entire algorithm as we describe it in the paper.

And with that, I am at the end. Thank you and I'll take questions now.

(Applause)

FRANZISKA LICHTBLAU: Any questions?

AUDIENCE SPEAKER: Steve Wallace, Internet2. Do you use BGP to affect what ingress point they choose, or do you use some other tools if you need to influence which point they are using?

INGMAR POESE: We're looking ‑‑ so in the comparisons, we have looked at where the BGP announcements for that source sub‑net would have come in, and where we potentially had seen them and then compare that to where we actually saw the traffic come in.

AUDIENCE SPEAKER: But you tried to change which ingress they are using, the CDN providers, do you do things to actively try to manage what they are doing, which path they select to come into your network?

INGMAR POESE: I am having trouble understanding you.

AUDIENCE SPEAKER: I'll talk to you offline.

INGMAR POESE: Sorry...

FRANZISKA LICHTBLAU: Maybe you can do that offline. Something I was wondering about, and I'm not sure if I got it correctly: in the beginning you need to do some work on link classification in order to avoid the double counting. How much effort is that if you would redo this for another setup?

INGMAR POESE: It really depends on how sophisticated the ISP running this is. One of the things that we, for example, do is use SNMP or telemetry to classify links and use those. Other ISPs already have this in OSS or BSS systems and can supply it to us. Some others say: we only sample on those links, and that's sometimes true. But actually, it's usually not a huge effort to do this, it's just something that needs to be done, otherwise the entire statistical analysis just goes off.

FRANZISKA LICHTBLAU: Okay. Thank you. Do we have... no, we don't have online participants. We don't have anyone at the microphone. Then thank you very much.

(Applause)

FRANZISKA LICHTBLAU: So, our next speaker, and I really hope he is here, is Remi Hendriks from the University of Twente. He is a Ph.D. student there, and his research focuses on Internet resiliency with a special focus on Anycast, which is what he is presenting on right now. And you may be interested that this is actually part of the RIPE NCC Community Projects Fund.

REMI HENDRIKS: Thank you. I am from the University of Twente, and today I will be talking about the project we did, which is called Anycast discovery: daily mapping of the Anycast landscape.

This is our team: it's me and my supervisors.

So, first I will talk about the Anycast census and explain why it's important. I will go over the methodology and how we realised it, and I will share some results. Then I will talk about the Anycast measurement tool that we developed to realise this census, and finally we will conclude and there will be time for questions.

So, let's explain the census.

So, what is Anycast? Usually, when you have an address, it's located in a single location; it's announced at one point, and that's Unicast. Anycast is where it's announced from multiple locations, and this allows for geographical distribution of an Internet service. It's widely used, as it provides resilience, low latency and load distribution.

To give an example: this is the Anycast network from CloudFlare, and if you connect to the quad 1 address, the public DNS resolver they have, you will connect to the one in Prague, since you are here in Prague. But if you send a probe to this address from New Zealand, you will receive a response from their site in Auckland.

This project was funded by the RIPE NCC Community Projects Fund, and we used this money for deployment and infrastructure costs, and also other research costs. To realise the census, we established a daily measurement pipeline, and with this pipeline we provide a reliable daily census that's publicly available.

So, why do you care about an Anycast census? Anycast is one of the most effective distribution and resilience techniques on the Internet. It's used for critical Internet infrastructure, and it's largely used by CDNs to provide low latency and reliability for their services. And lastly, it allows for mitigation of DDoS attacks and it's used to provide DDoS protection services.

But the problem with Anycast is that it's opaque. If you have an address, you don't know if it's Unicast or Anycast, and if you have a service, you don't know if it's provided using Anycast. You also don't know where the Anycast sites behind such a prefix are located.

So, why do operators care? Well, knowing what is and what is not Anycast is useful for making better traffic engineering choices. It allows you to troubleshoot network problems; Anycast has some weird behaviour, and it's always the corner case that can really show weird behaviour in your analysis. And it also allows you to do resilience assessment of third parties.

So, how do we realise the census?
We make use of two methodologies: an Anycast based methodology and a latency based methodology. We do these with multiple protocols, to achieve maximum responsiveness. The Anycast based approach is based on MAnycast squared, and it uses Anycast to measure Anycast. The latency based approach is based on ‑‑ I will explain all of these methodologies, starting with the Anycast based one.

So as I mentioned, we measure Anycast using Anycast. And these light blue dots are our Anycast infrastructure from which we measure.

So when we probe a responsive target on the Internet, in most cases it will be Unicast. Then this is what would happen: this Unicast target, in this example it's in Oklahoma City, will receive four probes from all of our Anycast sites, and for each of these probes it will send back a probe response. But since the destination of the probe response is Anycast, it will route to the nearest Anycast site, and in this example that's our site in Nashville, BNA. So we see that a single site will receive all four probe replies from the Unicast target. Then, if we probe another address that is actually Anycast, in this example an Anycast address that is announced in Las Vegas on the west coast and Atlanta on the east coast, we see that our Anycast sites on the west coast reach their site in Las Vegas and our sites on the east coast reach their site in Atlanta, and each of these sites routes back to a different one of our Anycast sites, because that's the nearest one. And that is how this methodology works: if multiple of our sites receive responses, we infer Anycast, but if one site receives all responses, we infer Unicast.
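A rough sketch of that classification step, with invented reply data; in reality the replies are observed in packet captures at each of the anycast sites:

    # Sketch: if probe replies for a target drain to more than one of our
    # anycast sites, the target is likely anycast; if a single site catches
    # them all, it is likely unicast.
    from collections import defaultdict

    def classify_targets(replies):
        """replies: iterable of (target_ip, receiving_site) for one probing round."""
        sites_per_target = defaultdict(set)
        for target, site in replies:
            sites_per_target[target].add(site)
        return {t: ("anycast" if len(s) > 1 else "unicast")
                for t, s in sites_per_target.items()}

    if __name__ == "__main__":
        observed = [
            # Unicast target in Oklahoma City: all replies drain to Nashville.
            ("203.0.113.7", "BNA"), ("203.0.113.7", "BNA"),
            ("203.0.113.7", "BNA"), ("203.0.113.7", "BNA"),
            # Anycast target announced on both coasts: replies split over sites.
            ("192.0.2.1", "LAS"), ("192.0.2.1", "ATL"),
        ]
        print(classify_targets(observed))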

The pros of this approach are that it has a really low probing cost, which makes it feasible for scanning at Internet scale, and it has a low false negative rate, which means it will rarely misclassify Anycast as Unicast. But the downsides are that it has a considerable false positive rate, meaning it will falsely classify Unicast as Anycast, and this happens when a Unicast target is in between two of our sites; then we see that the replies from the Unicast target end up at multiple locations due to load balancers or route flips.

And it also does not allow for geolocation. So you can detect Anycast and you can provide an enumeration of the number of sites behind the prefix, but you cannot geolocate the individual locations.

So that's why we have the second technique, the latency based approach, and this uses the great circle distance. Just like regular IP geolocation, it sends probes using a Unicast address and measures the latency until it gets the response.

Then you draw a circle around the node from which you measure and you look at the maximum distance that the packet could have travelled in that time. This approach looks for speed of light violations. In this example, we have four latency measurements, but the circles do not all overlap, so there is a speed of light violation, which means that the probed address has to be Anycast. Then it looks at the overlaps of these circles; there are two overlaps, which means there have to be at least two sites behind the Anycast prefix, and for the location it takes the city with the largest population in each overlap. This is considered the current state of the art. It has a low false positive and low false negative rate, it's highly accurate, and it also allows for geolocation. But the downsides are that it requires a really large measurement platform for accurate measuring, so you need, for example, RIPE Atlas or CAIDA Ark, because you need vantage points. It has a high probing cost; it puts a high burden on the measurement platform on which you do the measurement, but also on the Internet as a whole, and that makes it unsuitable for measuring at Internet scale.
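The speed-of-light check can be sketched in a few lines; the fibre propagation factor and the sample coordinates below are illustrative assumptions, not the exact parameters of the census pipeline:

    # Sketch: each vantage point's RTT bounds how far away the target can be.
    # If two such discs cannot overlap, no single location explains all RTTs,
    # so the target has to be anycast.
    import math

    C_KM_PER_MS = 299.792458          # speed of light in vacuum, km per millisecond
    FIBRE_FACTOR = 2.0 / 3.0          # rough propagation speed in fibre

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance between two coordinates, in kilometres."""
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dlat, dlon = p2 - p1, math.radians(lon2 - lon1)
        a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
        return 6371.0 * 2 * math.asin(math.sqrt(a))

    def max_radius_km(rtt_ms):
        """Furthest the target can plausibly be, given the round-trip time."""
        return (rtt_ms / 2.0) * C_KM_PER_MS * FIBRE_FACTOR

    def speed_of_light_violation(measurements):
        """measurements: list of (lat, lon, rtt_ms) tuples, one per vantage point."""
        for i, (la1, lo1, rtt1) in enumerate(measurements):
            for la2, lo2, rtt2 in measurements[i + 1:]:
                gap = haversine_km(la1, lo1, la2, lo2)
                if gap > max_radius_km(rtt1) + max_radius_km(rtt2):
                    return True
        return False

    if __name__ == "__main__":
        # 5 ms from both Amsterdam and Sydney cannot point at one single machine.
        probes = [(52.37, 4.90, 5.0), (-33.87, 151.21, 5.0)]
        print(speed_of_light_violation(probes))   # True, so: anycast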

So, to give an example: these are latency measurements using RIPE Atlas. The green dots mean low latency, the red dots mean high latency. On the left you see that if you probe a Unicast target from multiple locations, near the Unicast target, which in this case is on the west coast, there are low latency responses, but as you move farther away the latency increases. If you probe an Anycast prefix, you see that the latency is low in most places of the world, on multiple continents.

What we do for our census is to combine both approaches. First we perform the Anycast based approach, and we do this towards the entire IPv4 space, which means millions of addresses that we probe. This generates a set of Anycast targets, which includes the true positives and false positives from the Anycast based approach, and these are only tens of thousands. For those tens of thousands we then do the latency based measurement, and this gives us as output the Anycast prefixes found using either methodology, the enumeration that the methodologies find, and the geolocations from the latency based approach.

So, for our hit list, we chose /24 as the granularity threshold for IPv4 and /48 for IPv6, since these are the smallest routable BGP prefix sizes. For IPv4, we used the USC/ISI hit list, which has the most likely responsive address for each /24, and for IPv6 we used a combination of the Munich public IPv6 hit list and AAAA record addresses that we sourced from OpenINTEL.

So, to visualise the pipeline: we have the hit list with millions of addresses, we do the Anycast based approach with ICMP, TCP and UDP, and this has 32 vantage points. This generates the Anycast targets, which are only tens of thousands, and then we do the latency based approach with ICMP and TCP, and for this we use the CAIDA Ark platform in combination with our own testbed, which gives us 180 vantage points for IPv4. And the final result that we upload to the census is, for all the protocols, everything that either methodology detects to be Anycast.

So, to realise the census, we created an Anycast based measurement tool; I will talk more on this later. We deployed it using Vultr, which is present in 19 countries and six continents, and it provides good coverage in the global north, but the global south is still an open problem.

The latency based measurement we do with CAIDA's Ark platform and our Vultr vantage points, and we have 180 of them. We implemented this using Scamper, and we found that it provides accurate geolocation and enumeration for small Anycast deployments. The downside is that we cannot differentiate between sites in close geographic proximity. So, for example, Anycast sites in Amsterdam and Rotterdam are close together, and our approach will not be able to differentiate between them, because latency data is not always reliable.

Furthermore, for large deployments, think of CloudFlare, which I showed before and which has more than 300 Anycast sites, we were only able to enumerate up to 60. It's not feasible to enumerate all of them.

So, our census provides a combined view, because neither methodology is perfect. For the latency based approach there are rare cases of false negatives: if you have an Anycast deployment in Belgium and the Netherlands, it has difficulty detecting it because it's such a small area. The Anycast based approach is able to detect most cases through topological differences, but it suffers from false positives. So, if you use the census, the criteria are up to you; we provide both results. If you filter on both, you'll have some false negatives; if you filter on either, you will have to deal with some false positives. It depends on your requirements.

So, let's hear some results.

On any given day, our census finds approximately 12,300 /24s that are Anycasted. They originate from 769 autonomous systems. We find 4,000 /48s that are Anycasted, originating from 462 ASes. For IPv6, our coverage is limited by our hit list, because we cannot scan the entire IPv6 space, and we find that 300 ASes Anycast both IPv4 and IPv6, but we suspect this number is higher; it's just that we do not have a responsive IPv6 address for some of the Anycast in IPv6.

So, these are the largest organisations that deploy Anycast; we see they are mostly Cloud providers. These are the largest IPv4 autonomous systems that deploy Anycast, and we see that Google Cloud and CloudFlare alone are half of our census.

For IPv6, these are the five largest, and we see that CloudFlare alone contributes half of our census.

So, let's now go through the Anycast based approach and what its results are.

So, as I mentioned, we do this with multiple protocols, and on the bottom left you can see the total number of prefixes found through this methodology: 27,000 prefixes in total, with 9,000 found with TCP probing and 8,000 found with UDP probing. This bar graph shows the complement of the intersections. The blue bars are only found with one protocol; this is because the prefix is unresponsive to TCP and UDP, for example. So in the first bar we see that there are 16,000 Anycast prefixes that are only discoverable with ping. The green bars are discoverable with two protocols, and the red bars are discoverable with all protocols.

The main takeaways of this graph are that the majority of Anycast is detectable with ping, but there is a considerable number of prefixes that are found with the UDP and TCP approaches, and this was previously not possible with the traditional approaches.

This graph shows the CDF of the number of sites we detect behind Anycast prefixes. We find that one third of Anycast prefixes in our census has 40 or fewer sites, and the majority of Anycast prefixes, two thirds, has more than 40 sites, and these are largely attributed to the Cloud providers, Google Cloud, Amazon, CloudFlare, that have a large geographic coverage.

And as I mentioned, the enumeration is not perfect; it's a lower bound of the real deployment. We find that it's fairly accurate for small deployments, Anycast deployments with a few sites behind them, and it's a good indication of size for large deployments.

So, as I mentioned, we do this daily, and this allows us to map Anycast over time, longitudinal observations. With this longitudinal analysis, we found cases of Anycast deployments that regularly change in size: one day there are 30 sites behind them, the next day there are suddenly two. We see prefixes that switch between Unicast and Anycast, which we suspect is on demand Anycast. We see cases of BGP prefix hijacking, where a prefix is suddenly announced in east Africa and it's BGP prefix hijacking. We have seen cases of temporary Anycast, which we suspect to be anti-DDoS services. And we have also seen Anycast outages; we have seen sites go offline, or the entire prefix suddenly become unresponsive.
So, to give a summary of the census.

We created a responsible, scalable and accurate Anycast measurement pipeline, and with this we provide a daily census of Anycast. We detect, enumerate and geolocate Anycast, and we provide results from two methodologies. We hope this census is useful to the community; I will share a link later.

So, the Anycast measurement tool: we developed it for the census, but we find it's also useful for Anycast operators to improve their Anycast deployments.

So, so far, we have looked at measuring Anycast deployments externally, but now I'm going to look at how the tool can measure your own Anycast deployment internally.

This is useful because it allows Anycast operators to assess the performance of their deployment. For example, it can do catchment mappings like those done with Verfploeter, that is, mapping for each address the Anycast site that it routes to. And we find that our tool can map the entire address space in a few minutes, which is really useful because then you can do daily or even hourly catchment mappings with ease; it becomes trivial.

And we also find that you can get round-trip time data: the round-trip time your Anycast deployment achieves towards the entire Internet. This is done by filtering on the cases where the sender is equal to the receiver. So you probe from all vantage points, like we do in our Anycast based approach, and then you look at the probes where the origin is equal to the receiver, and then you get round-trip time data just like with a regular Anycast ping.

And for our deployment, we find that the average latency is 66 milliseconds, with a median of 24 milliseconds.

And the tool also allows for Unicast probing. You can probe a target from all your sites using their Unicast IP and you can compare the latencies that you found for each site and you can figure out which one is nearest to the client based on latency.

And you can repeat this for all the probe targets to get the actual round trip time achieved with Anycast and the lowest that is achievable with Unicast.

And for our deployment we find that if we take the minimum for each address, we have an average latency of 40 milliseconds, but the actual round-trip time we measured with our Anycast address is 66 milliseconds. So this means that it's inflated by 26 milliseconds, and this is caused by cases where a target reaches an Anycast site that is far away when there is one that's closer. That's what causes this inflation.

And with our tool you can find these cases where there is suboptimal routing to a site that's further away, and we find that in our case you can improve the performance by 40%, and this is possible using BGP prepending or selective BGP announcements. Further work is to automatically detect these suboptimal Anycast routing cases and then solve them automatically.
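As a small illustration of that comparison, here is a sketch with invented numbers matching the ones above; the data layout is an assumption, not the tool's actual output format:

    # Sketch: compare the RTT actually achieved via the anycast address with the
    # best RTT any single site achieves via its unicast address; the difference
    # is the inflation caused by a suboptimal catchment.
    def inflation_report(anycast_rtt, unicast_rtts):
        """anycast_rtt: {target: rtt_ms}; unicast_rtts: {target: {site: rtt_ms}}."""
        report = {}
        for target, rtt in anycast_rtt.items():
            best_site, best_rtt = min(unicast_rtts[target].items(), key=lambda kv: kv[1])
            report[target] = {
                "anycast_ms": rtt,
                "best_unicast_ms": best_rtt,
                "best_site": best_site,
                "inflation_ms": rtt - best_rtt,
            }
        return report

    if __name__ == "__main__":
        anycast = {"198.51.100.9": 66.0}
        unicast = {"198.51.100.9": {"AMS": 40.0, "SYD": 250.0}}
        for target, row in inflation_report(anycast, unicast).items():
            print(target, row)   # 26 ms inflation: a candidate for re-steering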

And it also has a bunch of other use cases. You can identify load distribution, you can probe with multiple prefixes simultaneously, so you can duplicate two prefixes, make a small change to one prefix, and then compare what that change actually does for your deployment. It provides coverage for IPv6, TCP and UDP probing. You can find network regions that experience site flipping; that is caused by a target that's connecting to your Anycast deployment but whose traffic is split between multiple of your Anycast sites, for example because of load balancers, and with our tool you can detect these cases.

And there is much more possible. So, to conclude: we provide a daily census for Anycast. You can access it here using this QR code. We use two methodologies, the Anycast based approach and the latency based approach. We make it publicly available, and we developed the measurement tool, which we will release soon.

Future work is to refine the pipeline. We want a web interface for live measurements, we want a dashboard to visualise the data, signal based measuring, and we also want to look at longitudinal analysis using, hopefully, years of data.

Finally, we have a call for contributions. If you are an Anycast operator, please check our census, see if your prefix is covered, and let us know; anonymously is also possible. We're also looking to collaborate: we would like to expand our testbed infrastructure, both for Unicast and Anycast, and we are especially interested in economically developing regions. For the measurement tooling, feel free to contact us; we are also interested to hear what Anycast operators would like out of such a tool. So you can contact me by e-mail or you can also approach me, I'll be here this week.

Thank you.

(Applause)

KEVIN MEYNELL: Thank you. Do we have any questions from the audience or online? Several questions.

AUDIENCE SPEAKER: I am Doris Hauser. Great talk, first of all. I really enjoyed it, and I was wondering about the methodology. Have you looked into just doing a traceroute and looking at the geolocation of the second to last hop?

REMI HENDRIKS: Yes, we did this analysis. The problem with traceroute is that a lot of networks use tunnelling, and we also see the second to last hop respond with an unusable address, a link local address, for example. And we tried to do this for the root servers and it performed horribly, so that's why we have this methodology.

AUDIENCE SPEAKER: I see, thank you.

AUDIENCE SPEAKER: Chris, from the RIPE NCC. I wonder if you have compared your results with what's available in RIPE IPmap?

REMI HENDRIKS: Yes, so we did a large scale measurement using 500 RIPE Atlas probes, compared to the 180 vantage points we usually have, and we ran into some issues with RIPE Atlas. First of all, the measurement took several days, because we had to rate limit the amount of traffic sent out, so we can't do it daily. And we also found that the probing cost increases significantly as we use more probes, but the number of additional sites that we detect diminishes; there are diminishing returns.

AUDIENCE SPEAKER: There is an API; RIPE IPmap uses all available Atlas data, which is mostly based on traceroutes, but the traceroutes are deaggregated and all of the hops are processed. It actually uses, by the sound of it, almost exactly the same methodology, so it does the clusters, so it would be interesting to compare with what's already in there, because you don't have to run new measurements; it's kind of everything that's been run in Atlas in the past two weeks or, you know, there is some cut-off. And it does provide answers similar to the latency side. So it would be interesting to see where there is overlap there. Maybe we could talk.

REMI HENDRIKS: Yes, thank you.

AUDIENCE SPEAKER: CloudFlare. Very nice work, I really like Anycast mapping. My question is about the false positive rate, where you detect Anycast while it's actually Unicast. Did you fix the five tuple in the TCP and UDP measurements when you were doing the measurements, or did you not do that?

REMI HENDRIKS: Yes, we keep the hash ‑‑ the five tuple static for load balancing. We also did a large scale measurement of load balancers on the Internet by manipulating this field; we're publishing this as a paper. But yes, we keep it static, and we also reduce the number of false positives from this technique because it's a known issue; for example, we do synchronised probing now, which was not previously done.

AUDIENCE SPEAKER: Cool. And even when it is fixed, do you know why it's still flapping? Like do you have any idea what the root cause of that is?

REMI HENDRIKS: We think there are cases of per packet load balancing: they do not only look at the five tuple but also at some random fields. And we suspect ‑‑ well, we see a large number of false positives originate from Microsoft's network, which is multihomed, and we think they have some internal traffic policy that for some reason sends back replies to multiple sites. We have also seen cases where the false positives were actually true positives, by talking to operators, but the latency based approach is unable to detect it.

AUDIENCE SPEAKER: I ‑‑ we have a chat question, and I'm sorry if I'm going to butcher the name, from Rinze Cloke: will this tool or software you built be available as an open source tool on GitHub?

REMI HENDRIKS: Yes, it will be available on GitHub.

KEVIN MEYNELL: Okay. So again thank you very much Remi.

(Applause)
That brings us to our final talk of this session, and this is going to be given by Dmitry, who is working for RedHat. Dmitry is a principal software engineer working on Fedora, which has been chosen to provide the initial support for the post quantum cryptography transition.

DIMITRY BELYAVSKIY: Hello. As I have been introduced, this is me. I work at RedHat, where I maintain OpenSSL, and I'm also a member of the technical committee. I am not a cryptographer, I am a software developer, and I'm not a network engineer, so please keep that in mind when I'm talking about what affects network protocols.

My current work is related to the post quantum transition. RedHat participates in a European grant project named QUBIP, and I believe that some parts of that project related to networking are also relevant for this audience.

So, I believe that you recognise these people, Alice and Bob, they were invited to demonstrate how important cryptography is for protecting communication and now we are probably in trouble because of post quantum threats. So let's begin with some clarification.

There are two different terms that shouldn't be mixed up. The term quantum cryptography is used for cryptography based on quantum mechanics, such as, for example, quantum key distribution, quantum random number generation, and so on and so forth. I will not cover this topic in my presentation; I don't understand this area at all.

But post quantum cryptography is a completely different beast. It's also known as quantum resistant cryptography or quantum safe cryptography, and it's cryptography that will be safe to use when quantum computers happen. Why do we need a transition at all? In the middle of the 1990s, scientists proved that quantum computers, which you can think of as something like magic for now, will break traditional cryptography. Mostly we are speaking about so-called asymmetric cryptography, so, you understand, classical RSA algorithms, elliptic curve and classical digital signatures, and key exchange based on the same math.

They are used in most protocols, especially at the handshake phase, and if they are broken it undermines the security of the network. Again, quantum computers are currently sort of magic; they are not here, and will hardly be here tomorrow. But the post quantum transition is near. All the governments are taking measures in this direction, and currently it's expected that post quantum algorithms will become mandatory at some moment around 2030. In different countries, there are different forecasts.

The history of implementing an open, traditional post quantum protocol standard begins in 2016 with the NIST contest. There were about 70 submissions in the first round, then the number of candidates was reduced, and in 2022 we got the first four algorithms suitable for standardisation. It has taken almost three years since that moment, and we got one algorithm suitable for key establishment and two algorithms for signature, and one more algorithm for signature is still in the queue; it will hopefully be standardised at the end of the year.

And it's still an ongoing process because, well, we can't be sure that the new algorithms provide real safety. There were four algorithms selected for round four, one of which happened to be broken completely without any quantum computers. There is a new contest for additional signature algorithms, with 14 algorithms. So, stay tuned!
Again, let me briefly enumerate the standards and the players.
The first is NIST, which, as I mentioned before, standardised two algorithms for signature; they say they expect them to be used in certificates and so on and so forth, and for signing, for example. And they also standardised the ML-KEM algorithm for key establishment. Now, when all these algorithms are standardised, it's time for the IETF to make the final standards for using them in various protocols. That process began several years ago, but now that we have final standards, we can better understand what to do.

And also, one of the players is the hardware group which, for example, defines the TPM specification, and that means hardware support and a basis for working with hardware.

As I mentioned before, I am not a mathematician, so feel free to download my presentation and click the links. This is a series of blog posts at our site that explains the mathematics at the level necessary to understand what it is and how it's used.
Now let's switch to the transition challenges. One of the main challenges is that we are going to build a secure system while not being able to fully trust the building blocks. We can't trust classical algorithms any more, because we definitely know that quantum computers, if they ever happen, will break them. And at the same time, we still cannot trust the new algorithms: as I mentioned before, one of them was completely broken, and they have just been investigated for too short a time to be sure enough that they will not break tomorrow.

It means that as a temporary solution we have the so-called hybrid approach, where we combine PQ key material or signatures with traditional cryptography, to ensure that as long as one of the components provides some level of security, we have the guarantee that the cryptography is not broken.
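The idea can be illustrated with a toy key-combination step: the session secret is derived from the concatenation of a classical shared secret and a post quantum shared secret, so an attacker has to break both. This is only a sketch, not the exact TLS construction, and the byte strings stand in for the outputs of real X25519 and ML-KEM operations:

    # Toy hybrid key derivation: both shared secrets feed one KDF step
    # (HKDF-extract style), so compromising either algorithm alone is not enough.
    import hashlib
    import hmac

    def combine_shared_secrets(classical_ss: bytes, pq_ss: bytes, context: bytes) -> bytes:
        """Derive one symmetric key from the classical and the PQ shared secret."""
        return hmac.new(context, classical_ss + pq_ss, hashlib.sha256).digest()

    if __name__ == "__main__":
        x25519_secret = bytes(32)        # placeholder for the classical ECDH output
        mlkem_secret = bytes(32)         # placeholder for the ML-KEM decapsulation output
        key = combine_shared_secrets(x25519_secret, mlkem_secret, b"hybrid-demo")
        print(key.hex())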

But that's not the only challenge. The other challenge is related to the parameters of the new algorithms. The first one, and what will affect network protocols significantly, is that the new algorithms all have big key and signature sizes. If we rely on, say, 128 bits of security, then for RSA, which has the biggest keys for that level of security, we have less than 400 bytes for the key and for the signature, each of them. But for ML-DSA, which is expected to have the same level of security, we have significantly larger sizes: we get plus 1 kilobyte on the key and plus 2 kilobytes on the signature.

The next one is the performance of the algorithms. Well, CloudFlare showed measurements that, for example, the key exchange is fast enough; yes, it's fast enough. But on the other hand, Google, introducing their ML-KEM support, did not introduce it everywhere in the browser because of performance issues. So it's too early to say whether the performance will be improved; of course it will be, but for now implementations are imperfect and, in general, slower than the traditional ones.

There will of course be compatibility problems related to middleboxes and to errors in implementations, because classical algorithms have been implemented for decades and the post quantum algorithms are freshly implemented. But as I mentioned before, we also get more specific problems.

So, we have two use cases for this. Digital signature algorithms: DSA is an answer to the questions: did you connect to the proper peer? Was the e-mail from the proper person? And does your fresh firmware update come from a trusted source?
Key establishment is used for deriving the symmetric keys that protect the communication. So these two components have different threat models and different countermeasures.

For DSA, the threat model is that an attacker could recover a private key from the public one and impersonate the owner. After that, they become able to mount a man-in-the-middle attack and forge signatures in real time.

Countermeasures for this attack require new hardware, new trusted roots, shipping these trusted roots to each and every device, and new end user certificates, new domain certificates and so on and so forth. But this attack has to happen in real time, and it is not possible without real quantum computers.

ML-KEM has a significantly different threat model, and it's more dangerous, because attackers can collect data now. Then they will be able to recover the symmetric keys once quantum computers are established, and if your secrets are still secrets, they will be revealed.
But luckily, the countermeasures are completely at the software layer, so we expect that it will be easy to fix.

For example, speaking of TLS, there have been widespread experiments based on the pre-standard version, Kyber, and Kyber based hybrids: they are available in Google Chrome, they are available in Firefox, they are provided by CloudFlare, and currently we see this key exchange moving to the standard versions. Google already uses the ML-KEM based hybrid, for example on YouTube.

So let's jump to the network problems. We have certificates; usually we have not a single certificate but a certificate chain, and compared to RSA we get a much bigger size for these certificate chains, because each real world certificate has one public key, plus 1 kilobyte, a signature from the issuer, plus 2 kilobytes, and, at least if we are speaking about the web, at least two signatures from the certificate transparency logs, plus 4 more kilobytes. So it means that chains balloon to around 22 kilobytes from 3 or 4 kilobytes.

Also, network protocols have request/response size limitations. It's in the QUIC standard (I don't know how and when this limitation is implemented), and it's a recommendation in DTLS. So it should be investigated and measured.

The traditional problem is related to TCP slow start. The TCP initial send window is about 10 MSS, so it's something like 12 kilobytes, and as you see, a 22 kilobyte certificate chain does not fit into those 12 kilobytes. So you can investigate a larger MSS, you can rely on certificate compression, which helps slightly, or you can reinvent the X.509 standard, which is being worked on by Google, but it's a problem you will meet.
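To get a feeling for the numbers, here is a back-of-the-envelope sketch comparing an RSA-style chain with an ML-DSA-style chain against a roughly 10-segment initial window. The per-object sizes are rounded illustrative figures, and the chain shape (a leaf plus two intermediates and two CT signatures) is an assumption, so the total lands below the roughly 22 kilobytes quoted above, which also includes further overhead; the point is only that the post quantum chain no longer fits the initial window:

    # Rough arithmetic: certificate chain size versus the TCP initial window.
    RSA_PUBKEY_KB = 0.4       # ~3072-bit RSA public key
    RSA_SIG_KB = 0.4
    MLDSA_PUBKEY_KB = 1.3     # ML-DSA-44 public key
    MLDSA_SIG_KB = 2.4        # ML-DSA-44 signature

    def chain_kb(pubkey_kb, sig_kb, certs=3, ct_sigs=2):
        """Each certificate carries a key plus an issuer signature; for the web,
        add a couple of CT log signatures on top."""
        return certs * (pubkey_kb + sig_kb) + ct_sigs * sig_kb

    if __name__ == "__main__":
        classical = chain_kb(RSA_PUBKEY_KB, RSA_SIG_KB)
        post_quantum = chain_kb(MLDSA_PUBKEY_KB, MLDSA_SIG_KB)
        initcwnd_kb = 10 * 1.2     # ~10 segments of ~1200 bytes of payload
        print(f"classical chain   ~{classical:.1f} kB")
        print(f"ML-DSA chain      ~{post_quantum:.1f} kB")
        verdict = "fits" if post_quantum <= initcwnd_kb else "needs extra round trips"
        print(f"initial window    ~{initcwnd_kb:.1f} kB -> ML-DSA chain {verdict}")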

Again, for UDP based protocols there will also be some congestion issues, and there is more investigation and measuring to be done.

The last example I would like to cover is DNSSEC. It's my beloved protocol: all problems in one. It's suitable for amplification, because small requests cause huge responses. The RRSIGs hardly fit into one packet now, and will definitely not fit into one packet when moving to post quantum algorithms. There is a proposal, ARRF, to split resource records and deal with them at the application level. And I know that after the lunch there will be a great presentation about the DNSSEC field experiments; I strongly recommend the whole audience to listen to it. I have listened to the version that was given at the IETF.

If you want to make your own post quantum experiments, I will briefly mention what we have in Fedora. We use liboqs, the library from the Open Quantum Safe project; it provides low level implementations of all the live post quantum algorithms, not only the standardised ones. We ship the oqs-provider, which is a plugin for OpenSSL. We have also implemented a post quantum container as part of the QUBIP project I mentioned earlier, so feel free to download it and use it for experiments. And of course we continue the upstream work with the important libraries for our distribution, OpenSSL and the others.

This slide illustrates what's currently in the scope of the post quantum transition for the operating system. We will provide post quantum algorithms in libraries; crypto-policies is a feature of our distributions that establishes system wide defaults for cryptography, and I think it's quite a useful feature. Applications are out of scope of the transition, but I will speak about them later. And the kernel is currently out of scope, but we will have to have post quantum support there to avoid hampering the process.

Which algorithms to choose? Well, the answer is luckily simple for us: basically the algorithms chosen by NIST, and the European countries follow this recommendation, as far as I know. There are still Kyber based hybrids deployed for the key exchange, but they are in the process of being replaced by ML-KEM. We expect incompatibilities between implementations, just because there are several of them, and I believe that they all have some bugs, and those bugs are in different places.

I briefly want to mention OpenSSH, which has had non-standard algorithms for several years, and they have recently implemented the ML-KEM algorithm. And there is a fresh effort establishing the process for standardisation of the protocol, so we believe that they will also rely on the standard.

What can you do for the post quantum transition? Test your networks. Establish your ‑‑ see if you can generate ML-DSA keys, and measure what happens. You will be surprised how many problems you will find, I think.
For the case of applications, you will have to find out which limitations are hard-coded; again, you will be surprised how many applications assume that there is no cryptography beyond RSA. When I was asked "do you foresee using the new API of OpenSSL?", after a counter-question I understood that they meant the API that is basically 15 years old. Please use the new API of OpenSSL.

And please work with the IETF and other standards bodies: choose your beloved group, choose your beloved protocol, and make sure that they are post quantum ready. In particular, I would recommend checking whether RPKI still recommends RSA and nothing beyond RSA, and I am pretty sure that there are many more network protocols that are not currently ready for the post quantum transition.

And this is basically the final slide, which contains the links to the papers and the publications that I strongly recommend reading. Post quantum cryptography for engineers is a nice IETF draft, which is soon going to become a standard, that explains the basic terminology and so on and so forth.

There is also a nice paper, "Do we need to change some things?", which basically inspired me to make this presentation. There is an IETF draft for a post quantum DNSSEC agenda, and the last link is a link to the previous version of the presentation of field measurements on post quantum DNSSEC. In the next version of my presentation, I will replace it with a link to the presentation that will happen after the lunch.

Thank you very much. Feel free to ask questions.

(Applause)

FRANZISKA LICHTBLAU: So, do we have questions? Yes, somebody is coming up.

AUDIENCE SPEAKER: Hi, everybody. Thank you for your great talk. I have one small question that I was curious about. You are talking about all this encryption in the post quantum area, and we also know that integrity is a part of security that's very important, and in signatures, hashing is a very important part of the whole algorithm. So are there any primitives for post quantum hashing algorithms, do you know?

DIMITRY BELYAVSKIY: Regarding hashing and symmetric ciphers, they are not vulnerable from the point of view of Shor's algorithm, so you can still use them. There is the so-called Grover's algorithm that will slightly improve the attack possibilities against symmetric ciphers and hashes, but the current consensus is that you can just increase the length of the key, or the length of the hash. You can safely use SHA-512 in the areas where you currently use SHA-256, for example. So, with hashes and symmetric ciphers you are basically on the safe side.

AUDIENCE SPEAKER: Thank you.

FRANZISKA LICHTBLAU: Okay. Maybe a high level question from my side. As you most likely noticed, most of us aren't experts in this field or following the development that closely; at least I haven't until this point. You gave a lot of recommendations and calls to action on what we could do as a community, as engineers. And as usual, starting things early is a good strategy, but can you give a rough timeline, an estimation of when we will see the relevance of this in the wild, when there will be a push to adoption?

DIMITRY BELYAVSKIY: The ML-KEM based hybrid is already here. You already see it: if you try to establish a connection using Google Chrome and connect to CloudFlare, you will with high probability see that ML-KEM is there. But the packet for ML-KEM is only slightly bigger than 1 kilobyte; it fits the standard MTU, so it should not cause major problems. But for digital signatures, the whole timeline which is currently followed by governments is that 2030 is the hard deadline. It means that you will see it in at least measurable amounts in, I would say, early 2026, probably late 2025. So it's rather soon; you will meet problems soon.

FRANZISKA LICHTBLAU: Okay. That was a very clear statement. Do we have any other questions? We don't have anyone online. Okay. Then thank you very much.

(Applause)
Before everyone runs off, the usual reminders: you can nominate yourself, or somebody else with their consent, to run for the Programme Committee to help us put all of this together, until today, 3:30. Afterwards everyone can vote for the PC, and you'd do us a great favour if you also rate the talks, so we know what you liked and didn't like so much. And now enjoy lunch.

(Lunch break)