
【Internship Report】Design and Implementation of FRRouting IS-IS SRv6 Extension

My name is Naoyuki Tachibana. I participated in the LINE internship, joining the Network Development Team of the Verda Department for six weeks from September 6th. During my internship, I developed an SRv6 extension for the IS-IS daemon of FRR, the open-source routing platform that LINE uses in its data centers. In this article, I introduce the results.

Background

LINE accommodates most of its many services in a private cloud called Verda, which uses SRv6 for network isolation. For details, see the following blog post.

Verda's SRv6 currently uses Neutron as its control plane, but the load on Neutron is heavy, so we wanted to be able to use a routing protocol (the SRv6 BGP control plane) instead. Because there was no open-source implementation of SRv6 routing protocol extensions, the Network Development Team developed an SRv6 data plane management function and a BGP SRv6 extension in the open-source routing platform FRR (for details, see FRRouting/frr/pull/5865). Although the routing protocol extension we will adopt in the near term is the BGP L3VPN function, we also want to contribute broadly to the SRv6 community and energize it by increasing the number of OSS implementations of the SRv6 control plane. Our long-term goal is not only to invigorate the community but also to discover and resolve the essential issues of SRv6.

Therefore, this time I implemented the SRv6 extension of IS-IS, which is the second most used protocol after BGP according to the SRv6 deployment status (draft-matsushima-spring-srv6-deployment-status-11#section-2). (The referenced draft has expired, but it is the most accurate publicly available information.)

IS-IS SRv6 Extension

What is SRv6

SRv6 is Segment Routing realized over IPv6. A Segment is an instruction applied to a packet: for example, “forward via a specific node” (Node Segment) or “forward from a specific router to a specific adjacent router” (Adjacency Segment). By attaching a list of such Segments to a packet, the sender can control the path the packet takes.

An ID called a SID (Segment ID) is allocated to uniquely identify each Segment in the network. In SRv6, a SID is encoded in the IPv6 address format. The SIDs of the Segments to be traversed are stored in an IPv6 extension header called the Segment Routing Header (SRH). A SID is divided into three fields: Locator, Function, and Argument.

The Locator identifies an SRv6 node. The Function identifies the processing to perform within that node. In addition to the well-known Functions defined in RFC 8986 (see the following table), you can define your own Functions. The Argument carries an argument for the Function.

Function | Overview | Purpose
End | Update the Destination Address and SRH, then look up the next hop in the RIB and forward | Node Segment
End.X | Update the Destination Address and SRH, then forward to the specified next hop | Adjacency Segment
End.T | Update the Destination Address and SRH, then look up the next hop in the “specified RIB” and forward | Multi-VRF operation
End.DT4 | Remove the SRH, then look up the IPv4 next hop in the “specified RIB” and forward (NH=4) | VPNv4
End.DT6 | Remove the SRH, then look up the IPv6 next hop in the “specified RIB” and forward (NH=41) | VPNv6

Although the RFC allows the bit lengths of the Locator, Function, and Argument to vary, note that network operating systems (NOS) impose some limitations on them.

Reference: https://www.cisco.com/c/en/us/td/docs/iosxr/ncs5xx/segment-routing/72x/b-segment-routing-cg-72x-ncs540/configure-srv6.html#id_95420
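
To make the three-field split concrete, here is a minimal sketch in C that extracts the Locator, Function, and Argument from a 128-bit SID. The names (sid_layout, sid_bits, and so on) are hypothetical and are not part of FRR. The 64/64/0 split used in the usage note is only an assumption for illustration, since the actual widths depend on the deployment and the NOS.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a SID layout: field widths in bits.
 * The three widths must add up to at most 128. */
struct sid_layout {
    unsigned loc_len;
    unsigned func_len;
    unsigned arg_len;
};

/* Read nbits (0..64) bits of a 128-bit SID, starting `start` bits
 * from the most significant bit. A width of 0 yields 0. */
static uint64_t sid_bits(const uint8_t sid[16], unsigned start, unsigned nbits)
{
    assert(nbits <= 64 && start + nbits <= 128);
    uint64_t v = 0;
    for (unsigned i = start; i < start + nbits; i++) {
        unsigned bit = (sid[i / 8] >> (7 - i % 8)) & 1u;
        v = (v << 1) | bit;
    }
    return v;
}

/* This sketch only supports Locators of up to 64 bits. */
static uint64_t sid_locator(const uint8_t sid[16], struct sid_layout l)
{
    return sid_bits(sid, 0, l.loc_len);
}

static uint64_t sid_function(const uint8_t sid[16], struct sid_layout l)
{
    return sid_bits(sid, l.loc_len, l.func_len);
}

static uint64_t sid_argument(const uint8_t sid[16], struct sid_layout l)
{
    return sid_bits(sid, l.loc_len + l.func_len, l.arg_len);
}
```

For example, with the SID 2001:db8:f:1::100 and an assumed layout of {64, 64, 0}, sid_locator() returns 0x20010db8000f0001 and sid_function() returns 0x100.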

What is IS-IS

IS-IS (Intermediate System to Intermediate System) is a link-state routing protocol used as an IGP, like OSPF. In IS-IS, a router is called an IS (Intermediate System). IS-IS is not widely known in Japan, but it is likely to become hard to ignore: for example, it is adopted by overseas ISPs, and new extensions tend to be implemented for IS-IS sooner than for OSPF.

In this section, I briefly explain what is needed to understand the IS-IS SRv6 Extension: the IS-IS packet format, the main types of PDU (Protocol Data Unit), and the concept of TLV (Type Length Value). The packet format of IS-IS is as follows.

IS-IS PDUs are roughly divided into four types:

  • Hello PDU: used to establish neighbor relationships between ISs
  • LSP (Link State PDU): used to exchange reachability information
  • CSNP (Complete Sequence Numbers PDU): used to detect LSP updates
  • PSNP (Partial Sequence Numbers PDU): used for LSP requests, acknowledgments, etc.
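
As a rough illustration of how these four kinds are told apart on the wire, the sketch below classifies a PDU from its 8-byte common header. It assumes the ISO 10589 layout, in which the first octet is the protocol discriminator 0x83 and the low 5 bits of the fifth octet carry the PDU type. The enum and function names are hypothetical and do not come from FRR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical grouping of the PDU types into the four kinds above. */
enum isis_pdu_kind {
    PDU_KIND_UNKNOWN,
    PDU_KIND_HELLO,
    PDU_KIND_LSP,
    PDU_KIND_CSNP,
    PDU_KIND_PSNP
};

/* Classify a PDU by inspecting its common header. */
static enum isis_pdu_kind isis_classify_pdu(const uint8_t *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 8 || hdr[0] != 0x83)
        return PDU_KIND_UNKNOWN;

    switch (hdr[4] & 0x1f) {
    case 15: /* Level 1 LAN Hello */
    case 16: /* Level 2 LAN Hello */
    case 17: /* Point-to-point Hello */
        return PDU_KIND_HELLO;
    case 18: /* Level 1 LSP */
    case 20: /* Level 2 LSP */
        return PDU_KIND_LSP;
    case 24: /* Level 1 CSNP */
    case 25: /* Level 2 CSNP */
        return PDU_KIND_CSNP;
    case 26: /* Level 1 PSNP */
    case 27: /* Level 2 PSNP */
        return PDU_KIND_PSNP;
    default:
        return PDU_KIND_UNKNOWN;
    }
}
```

A header whose fifth octet is 20 is therefore classified as an LSP, and anything shorter than 8 bytes is reported as unknown.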

The basic flow of establishing IS-IS neighbors is as follows:

  1. Exchange Hello PDUs to establish the neighbor relationship
  2. Exchange route information in LSPs and register it in the LSDB (Link State Database)
  3. Periodically synchronize the LSDB using CSNPs and PSNPs

IS-IS carries IS-specific information in the variable-length part of a PDU, encoded in the TLV format. The following shows some examples of information advertised in TLVs. For other TLVs, see the IANA code points (https://www.iana.org/assignments/isis-tlv-codepoints/isis-tlv-codepoints.xhtml).

Type | Name | Description | Reference
1 | Area address | Area address to which the IS belongs | ISO 10589
2 | IS neighbors | ISs adjacent to this IS | ISO 10589
10 | Authentication | Authentication information of the LSP | ISO 10589
14 | LSPBufferSize | Size of PDU that can be received | ISO 10589
132 | IP Interface Address | IPv4 interface address | RFC1195
232 | IPv6 Interface Address | IPv6 interface address | RFC5308
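
The TLV idea itself is simple: one byte of Type, one byte of Length, then Length bytes of Value, repeated until the end of the PDU. The following is a minimal sketch of a TLV lookup over a plain byte buffer. The function name tlv_find is hypothetical, and this is not how FRR's isisd parses TLVs internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Search a TLV area for the first TLV of type `want`.
 * On success, return a pointer to its value and store its length
 * in *len_out. Return NULL if the type is absent or the buffer is
 * truncated before the TLV is found. */
static const uint8_t *tlv_find(const uint8_t *buf, size_t size,
                               uint8_t want, uint8_t *len_out)
{
    assert(buf != NULL && len_out != NULL);
    size_t off = 0;

    /* Each TLV needs at least 2 bytes: Type and Length. */
    while (size - off >= 2) {
        uint8_t type = buf[off];
        uint8_t len = buf[off + 1];

        if (size - off - 2 < len)
            return NULL; /* Value runs past the end of the buffer. */
        if (type == want) {
            *len_out = len;
            return buf + off + 2;
        }
        off += 2 + (size_t)len; /* Skip to the next TLV. */
    }
    return NULL;
}
```

For a buffer that holds an Area address TLV (type 1) followed by an IP Interface Address TLV (type 132), tlv_find(buf, size, 132, &len) skips the first TLV and returns a pointer to the 4-byte IPv4 address.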

What is IS-IS SRv6 Extension

The IS-IS SRv6 Extension is the extension needed to use SRv6 in an IS-IS network. For details, see the draft (draft-ietf-lsr-isis-srv6-extensions). In short, to use SRv6 in an IS-IS network, each IS must advertise its Locator, Node Segment, and Adjacency Segment information in PDUs. The draft defines the Locator TLV, the End SID Sub-TLV (Node Segment), and the End.X SID Sub-TLV (Adjacency Segment) as follows.

SRv6 Locator TLV

The SRv6 Locator TLV advertises SRv6 Locator information to the IS-IS network. The following shows the data format of the TLV. Its Type is 27, and it is advertised as a TLV of the LSP. One TLV can advertise multiple SRv6 Locators.

Each Locator is advertised in the following data format. It contains the Prefix of the actual SRv6 Locator, the algorithm number of the policy adopted in IGP, etc.

End SID Sub-TLV

A Sub-TLV is a TLV nested inside another TLV to carry information that is advertised together with it. The Sub-TLV Type of the End SID is 5, and it is advertised as a Sub-TLV of the Locator TLV.

End.X SID Sub-TLV

The Sub-TLV Type is 43. An End.X SID is assigned to an Adjacency Segment, so it is advertised as a Sub-TLV of the IS reachability TLV.

Implementation

In this section, I introduce the actual flow of my development, in the hope that it will serve as a reference for developing and extending FRR functions. The flow of the development is as follows.

  1. Understanding Design of FRR
  2. Organizing Completed State from Draft
  3. Developing Topology for Test
  4. Creating CLI
  5. Creating Actual Content
  6. Implementing Exchange Logic of TLV

The following describes each section in detail.

Understanding Design of FRR

Process Structure of FRR

FRR does not run as a single process but consists of multiple processes: zebra sits at the core, the routing protocol processes run on top of it, and VTYSH operates all of them through a CLI. A dedicated API called ZAPI is used for communication between each routing protocol process and zebra. In ZAPI, the zebra side is called zserver and the routing protocol process side is called zclient.

Locator Management of SRv6 Manager

As I mentioned at the beginning, SRv6 Manager is a function that manages SRv6 information in zebra (the core of FRR). SRv6 Manager logically divides one Locator into Locator Chunks and assigns an owner to each Locator Chunk. Because zebra tracks the owner of each Locator Chunk, multiple routing protocol processes can share one Locator.

Routing protocol processes request the use of a Locator through a newly added ZAPI message, but I omit the details here. The source code added in the PR is not very large, so reading it is a good way to understand the mechanism.

Reference: https://github.com/FRRouting/frr/pull/5865
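
To illustrate the idea of chunk ownership only, here is a toy model in C. It is not FRR's actual data structure or ZAPI message flow: every name in it (demo_locator, demo_chunk_acquire, and so on) and the fixed number of chunks are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_CHUNKS 4 /* Assumed chunk count, for illustration only. */

/* Hypothetical identifiers for the protocol processes (zclients). */
enum demo_proto { DEMO_PROTO_NONE = 0, DEMO_PROTO_BGP, DEMO_PROTO_ISIS };

/* One Locator, logically divided into chunks with one owner each. */
struct demo_locator {
    const char *name;
    enum demo_proto chunk_owner[DEMO_MAX_CHUNKS];
};

/* Give the first unowned chunk to `proto`.
 * Return the chunk index, or -1 if every chunk is already owned. */
static int demo_chunk_acquire(struct demo_locator *loc, enum demo_proto proto)
{
    assert(loc != NULL && proto != DEMO_PROTO_NONE);
    for (int i = 0; i < DEMO_MAX_CHUNKS; i++) {
        if (loc->chunk_owner[i] == DEMO_PROTO_NONE) {
            loc->chunk_owner[i] = proto;
            return i;
        }
    }
    return -1;
}

/* Release every chunk owned by `proto`, for example when that
 * process goes away. Return the number of chunks released. */
static int demo_chunk_release_all(struct demo_locator *loc, enum demo_proto proto)
{
    assert(loc != NULL && proto != DEMO_PROTO_NONE);
    int released = 0;
    for (int i = 0; i < DEMO_MAX_CHUNKS; i++) {
        if (loc->chunk_owner[i] == proto) {
            loc->chunk_owner[i] = DEMO_PROTO_NONE;
            released++;
        }
    }
    return released;
}
```

In this model, a BGP process and an IS-IS process that both ask for loc1 receive chunks 0 and 1, so they share the Locator without handing out overlapping SIDs. Keeping the owner table in one place is what lets the central process arbitrate.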

Organizing Completed State from Draft

In this step, read the draft and confirm the goal. In this case, the goal is that the SRv6 Locator and the End/End.X SIDs allocated from it are shared via LSPs (Link State PDUs), and that every router can look up the information of all ISs in its LSDB.

At this point, also sketch to some extent how FRR should behave externally once the IS-IS SRv6 Extension is implemented (the config to be input and the expected output).

  • Config to be input
segment-routing
 srv6
  locators
   locator loc1
    prefix 2001:db8:f:1::/64
   !
  !
 !
!
router isis 1
 segment-routing srv6
  locator loc1 ! This isisd allocates SIDs from the prefix of loc1
 !
!
  • Expected output (excerpted)
// RIB
R1> show ipv6 route
I*> 2001:db8:f:1::100/128 End
I*> 2001:db8:f:1::200/128 End.X 2001:db8:1:2::2
I*> 2001:db8:f:1::300/128 End.X 2001:db8:1:3::3
 
R1> show ipv6 route json
{
  "2001:db8:f:1::100/128": {
     "seg6local_action": "End"
  },
  "2001:db8:f:1::200/128": {
     "seg6local_action": "End.X",
     "nexthop6": "xx"
  },
  "2001:db8:f:1::300/128": {
     "seg6local_action": "End.X",
     "nexthop6": "xx"
  }
}
 
// LSDB
R1> show isis database detailed
r1-00-0. 
 SRv6 Locator: 2001:db8:f:1::/64
    End SID: 2001:db8:f:1::100/128 End
 Ext IS Reachability: xxx
    End.X SID: 2001:db8:f:1::200/128 (2001:db8:1:2::2)
    End.X SID: 2001:db8:f:1::300/128 (2001:db8:1:3::3)

Developing Topology for Test

Create the topology of the test environment. FRR provides a framework called topotest: once you register a topology of FRR routers and the tests to run on it, topotest applies the configuration and performs the verification. New test topologies are added frequently to keep FRR's functions working as expected.

Reference: https://github.com/FRRouting/frr/tree/master/tests/topotests

For this IS-IS SRv6 Extension, the following topology is created to confirm the functions. An SRv6 Locator is set on each of R1 through R4.

Creating CLI 

Before implementing the actual behavior, implement empty commands so that FRR recognizes them.

R1(conf)> router isis 1
R1(conf-isis)> segment-routing srv6 // Make this visible.
R1(conf-isis-srv6)> locator loc1 // Make this visible.


The IS-IS CLI is managed through YANG: by adding nodes to the YANG model, a setting can be configured from the CLI and shown by the show commands.


~~~~~
container segment-routing {
  description "Segment Routing global configuration.";
  container srv6 {
    description "srv6 global configuration.";
    leaf locator {
      type string;
      default "default";
      description "locator name";
    }
  }
~~~~~

For example, the YANG definition above was added for this SRv6 setting. Each YANG path is then tied to callbacks in the following format.

{
    .xpath = "/frr-isisd:isis/instance/segment-routing/srv6",
    .cbs = {
        .cli_show = cli_show_isis_sr_srv6, // Show "segment-routing srv6" when "show running-config" is executed.
        .cli_show_end = cli_show_isis_sr_srv6_end,// Display "exit" after showing the srv6 configuration items in "show running-config".
        .create = isis_instance_sr_srv6_create,// Add "container srv6" to yang tree.
        .destroy = isis_instance_sr_srv6_destroy,// Remove "container srv6" from yang tree.
    },
    .priority = NB_DFLT_PRIORITY - 1,
},
{
    .xpath = "/frr-isisd:isis/instance/segment-routing/srv6/locator",
    .cbs = {
        .cli_show = cli_show_isis_sr_srv6_locator,// Show locator name when "show running-config" is executed.
        .modify = isis_instance_sr_srv6_locator_modify,// Add "leaf locator" to yang tree.
    },
},

Then, implement the function for each of these callbacks.

Implementing Exchange Logic of TLV

First, define the data format of each TLV in FRR as a structure. The following shows the structures for the three TLV data formats added this time.

Locator TLV:

struct isis_srv6_locator_info {
    struct isis_srv6_locator_info *next; // Pointer to the next locator.
       
    uint32_t metric;
    uint8_t flags;
    uint8_t algorithm;
    uint8_t loc_size;
    struct in6_addr locator;
 
    uint8_t sub_tlv_len;
    struct isis_srv6_loc_subtlvs *subtlvs;
};

End SID Sub-TLV:

struct isis_srv6_sid_end {
    struct isis_srv6_sid_end *next;
 
    uint8_t flags; // The type and length are added to the TLV information when it is sent.
    uint16_t endpoint_behavior;
    struct in6_addr sids[SRV6_MAX_SIDS];// The current FRR SRv6 has a restriction that up to 16 SIDs can be installed.
};

End.X SID Sub-TLV:

struct isis_srv6_sid_end_x {
    struct isis_srv6_sid_end_x *next;
 
    uint8_t flags; // The type and length are added to the TLV information when it is sent.
    uint8_t algorithm;
    uint8_t weight;
    uint16_t endpoint_behavior;
    struct in6_addr sids[SRV6_MAX_SIDS];
};

Next, implement the interfaces. To advertise a TLV in FRR's IS-IS, five interfaces must be implemented. Once they are implemented and hooked into FRR's IS-IS, the TLV is automatically advertised in LSPs as needed.

  • pack: Write the TLV information to a stream
  • unpack: Read the TLV information from a stream
  • copy: Duplicate the structure holding the TLV information
  • free: Release the memory of the structure holding the TLV information
  • format: Display the TLV information in the database detailed CLI
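
The five interfaces can be pictured as a table of function pointers that generic code calls without knowing the concrete TLV. The sketch below shows that pattern with a toy item holding only a 32-bit metric. The names demo_item, demo_tlv_ops, and demo_roundtrip are hypothetical, and the signatures are simplified assumptions rather than the ones FRR's isisd actually uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy item type: a single 32-bit metric. */
struct demo_item {
    uint32_t metric;
};

/* Hypothetical table of the five per-TLV operations. */
struct demo_tlv_ops {
    size_t (*pack)(const struct demo_item *it, uint8_t *out, size_t cap);
    struct demo_item *(*unpack)(const uint8_t *in, size_t len);
    struct demo_item *(*copy)(const struct demo_item *it);
    void (*free_item)(struct demo_item *it);
    int (*format)(const struct demo_item *it, char *buf, size_t cap);
};

/* pack: write the metric in network byte order. */
static size_t demo_pack(const struct demo_item *it, uint8_t *out, size_t cap)
{
    if (cap < 4)
        return 0;
    out[0] = (uint8_t)(it->metric >> 24);
    out[1] = (uint8_t)(it->metric >> 16);
    out[2] = (uint8_t)(it->metric >> 8);
    out[3] = (uint8_t)it->metric;
    return 4;
}

/* unpack: allocate an item and fill it from the wire bytes. */
static struct demo_item *demo_unpack(const uint8_t *in, size_t len)
{
    if (len < 4)
        return NULL;
    struct demo_item *it = calloc(1, sizeof(*it));
    if (it)
        it->metric = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                     ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    return it;
}

/* copy: duplicate the item. */
static struct demo_item *demo_copy(const struct demo_item *it)
{
    struct demo_item *dup = malloc(sizeof(*dup));
    if (dup)
        memcpy(dup, it, sizeof(*dup));
    return dup;
}

/* free: release the item. */
static void demo_free(struct demo_item *it)
{
    free(it);
}

/* format: render the item as text for a show command. */
static int demo_format(const struct demo_item *it, char *buf, size_t cap)
{
    return snprintf(buf, cap, "metric %u", (unsigned)it->metric);
}

static const struct demo_tlv_ops demo_ops = {
    .pack = demo_pack,
    .unpack = demo_unpack,
    .copy = demo_copy,
    .free_item = demo_free,
    .format = demo_format,
};

/* Generic caller: touches the item only through the ops table.
 * Return 0 on success and -1 on failure. */
static int demo_roundtrip(const struct demo_tlv_ops *ops,
                          const struct demo_item *in, char *text, size_t cap)
{
    assert(ops != NULL && in != NULL && text != NULL);
    uint8_t wire[16];
    size_t n = ops->pack(in, wire, sizeof(wire));
    struct demo_item *got = n ? ops->unpack(wire, n) : NULL;
    if (!got)
        return -1;
    struct demo_item *dup = ops->copy(got);
    ops->free_item(got);
    if (!dup)
        return -1;
    ops->format(dup, text, cap);
    ops->free_item(dup);
    return 0;
}
```

Calling demo_roundtrip(&demo_ops, &item, text, sizeof(text)) with an item whose metric is 10 leaves "metric 10" in text. The point of the design is that adding a new TLV means supplying a new table, while the generic code stays unchanged.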

Here, let me introduce part of the pack and unpack implementation of the Locator TLV as an example.

pack:

static int pack_item_srv6_locator_info(struct isis_item *i, struct stream *s, size_t *min_len)
{
    struct isis_srv6_locator_info *r;
~~~~~
    stream_putl(s, r->metric); // Add metric information to the stream. l in putl means long.
    stream_putc(s, r->flags); // Add the flags information to the stream. c in putc means char.
    stream_putc(s, r->algorithm); // Add the algorithm information to the stream.
    stream_putc(s, r->loc_size); // Add loc_size information to the stream.
    uint8_t spl = (r->loc_size + 7) / 8; // Number of bytes needed to hold loc_size bits.
    stream_put(s, &r->locator, spl); // Add locator information to the stream.
~~~~~
}

unpack:

static int unpack_item_srv6_locator_info(uint16_t mtid, uint8_t len, struct stream *s, struct sbuf *log, void *dest, int indent)
{
    struct isis_srv6_locator_info *rv;
    rv = XCALLOC(MTYPE_ISIS_TLV, sizeof(*rv)); // Allocate a memory area for the size of isis_srv6_locator_info.
~~~~~
    rv->metric = stream_getl(s); // Receive metric information from the stream. l in getl means long.
    rv->flags = stream_getc(s); // Receive flags information from the stream. c in getc means char.
    rv->algorithm = stream_getc(s); // Receive algorithm information from the stream.
    rv->loc_size = stream_getc(s); // Receive loc_size information from the stream.
    uint8_t spl = (rv->loc_size + 7) / 8; // Number of bytes needed to hold loc_size bits.
    stream_get(&rv->locator, s, spl); // Receive locator information from the stream.
~~~~~
}
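
The excerpts above omit the length checks. As a self-contained illustration of the same field order (metric, flags, algorithm, loc_size, locator, sub-TLV length), here is a sketch that packs and unpacks one Locator entry over a plain byte buffer instead of FRR's stream API. The names loc_entry, loc_entry_pack, and loc_entry_unpack are hypothetical, and sub-TLVs are skipped rather than parsed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified Locator entry; sub-TLVs are not modeled. */
struct loc_entry {
    uint32_t metric;
    uint8_t flags;
    uint8_t algorithm;
    uint8_t loc_size;    /* Locator length in bits, 1..128. */
    uint8_t locator[16];
};

/* Number of bytes needed to hold loc_size bits. */
static size_t loc_bytes(uint8_t loc_size)
{
    return ((size_t)loc_size + 7) / 8;
}

/* Write one entry with an empty sub-TLV area.
 * Return the number of bytes written, or 0 on error. */
static size_t loc_entry_pack(const struct loc_entry *e, uint8_t *out, size_t cap)
{
    assert(e != NULL && out != NULL);
    if (e->loc_size == 0 || e->loc_size > 128)
        return 0;
    size_t spl = loc_bytes(e->loc_size);
    size_t need = 7 + spl + 1;
    if (cap < need)
        return 0;
    out[0] = (uint8_t)(e->metric >> 24);
    out[1] = (uint8_t)(e->metric >> 16);
    out[2] = (uint8_t)(e->metric >> 8);
    out[3] = (uint8_t)e->metric;
    out[4] = e->flags;
    out[5] = e->algorithm;
    out[6] = e->loc_size;
    memcpy(out + 7, e->locator, spl);
    out[7 + spl] = 0; /* Sub-TLV length: no sub-TLVs. */
    return need;
}

/* Read one entry, checking every length before touching the data.
 * Return the number of bytes consumed, or 0 if the input is invalid. */
static size_t loc_entry_unpack(const uint8_t *in, size_t len, struct loc_entry *e)
{
    assert(in != NULL && e != NULL);
    if (len < 7)
        return 0;
    memset(e, 0, sizeof(*e));
    e->metric = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    e->flags = in[4];
    e->algorithm = in[5];
    e->loc_size = in[6];
    if (e->loc_size == 0 || e->loc_size > 128)
        return 0;
    size_t spl = loc_bytes(e->loc_size);
    if (len < 7 + spl + 1)
        return 0;
    memcpy(e->locator, in + 7, spl);
    size_t sub_len = in[7 + spl];
    if (len < 7 + spl + 1 + sub_len)
        return 0;
    return 7 + spl + 1 + sub_len; /* Sub-TLVs are skipped, not parsed. */
}
```

For a /64 Locator such as 2001:db8:f:1::/64, only 8 bytes of the address go on the wire, so one entry without sub-TLVs occupies 16 bytes, and feeding the unpack function a buffer that is one byte short makes it return 0 instead of reading past the end.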

Operation Verification

In this operation verification, a config that sets a Locator is applied to R1 through R4 of the verification topology, and topotest verifies that the SIDs appear in the RIB. In this section, I introduce the settings and verification content using R1 as an example.

Config example of R1 (excerpted)

// isisd.conf
router isis 1
 segment-routing srv6
  locator loc1
 
//zebra.conf
segment-routing
 srv6
  locators
   locator loc1
    prefix 2001:db8:f:1::/64

FRR can output the RIB in JSON format. topotest compares this output with an expected result prepared beforehand and decides Pass/Fail based on whether they match, so the test requires the JSON of the expected output. In this case, the following content is expected.

  • seg6local action End is set in one Prefix (End SID)
  • seg6local action End.X is set in two Prefixes (End.X SID)

Test result

Also, check whether the TLVs implemented this time are actually advertised as part of a PDU. topotest has an option to deploy only the test topology without running the tests. Deploy the topology using this option, cause a pseudo link failure on eth0 of R1 using the ip link command, capture the LSPs that flow when communication is restored, and inspect the pcap.

First, the Locator. It is advertised as TLV number 27, as specified in the draft.

Next, the End SID. It is advertised as sub-TLV 5 of the SRv6 Locator TLV. The SID is allocated from the Locator of R1, 2001:db8:f:1::/64.

Lastly, the End.X SID. It is advertised as sub-TLV 43 of the MT IS Reach TLV. The End.X SIDs toward R2 and R3 are each allocated from the Locator of R1, 2001:db8:f:1::/64.

Future Prospect

The IS-IS SRv6 Extension implemented this time makes it possible to configure SRv6 in the network and advertise the information. However, some points remain to be improved. For example, there are two drafts associated with the IS-IS SRv6 Extension.

Introducing these technologies would make SRv6 network separation and traffic control more flexible, so I want to add them. I also want to get this extension merged upstream.

Considerations for Development

Before this internship, I had only written code of at most several hundred lines for experiments and had no experience of large-scale development. I had to complete everything from development through verification in the limited time, even though at first I did not know the mechanisms and internal structure of FRR, SRv6, and so on. This may be a rare situation, and my conclusions may be mundane, but let me share what I kept in mind during this development.

Organizing My To-dos in Details Including Learning of Knowledge before Development

Before starting development, list and organize the knowledge and preparation needed for development and verification. At this stage, the list can be based on guesswork. Of course, the to-dos will change as you learn and develop, so update the list accordingly. What matters is to grasp the big picture of the development and make your progress visible.

Prioritizing To-dos

It is necessary to prioritize to-dos. For example, when a new task occurs to you while working through the list, consider how important it is for completion and add it to the list; if it does not have to be done right now, such as making the code look clean, do other things first. Especially when modifying code written by others, as in OSS contributions, it is tempting to reuse existing code, because existing code is often cleanly written with a library-like structure. However, if that takes too much time, it is putting the cart before the horse. Cleaning the code can be done in the last stage, so give top priority to getting working code, with approaches such as “First create the container and then the content” and “Create tentative global variables and exchange values through them.”

Actively Sharing Questions

In my opinion, this is the most important point. If you have questions, ask them where the most people will see them. This has the following benefits:

  • Not only team members but also other people may answer your question
  • Even basic questions can provide knowledge to other members

When asking a question, do not just ask for advice. Think for about 15 minutes by yourself, share your hypothesis, and ask whether it is correct. This organizes your thoughts and makes the discussion very constructive. In any case, it is important not to hesitate to ask questions, even if they feel too trivial.

Conclusion

I had never experienced such fast-paced development, completing the whole process from learning the technologies to creating a roadmap, design, implementation, and verification in one month, but the time flew by and it was a fulfilling experience. I also gained very valuable experience, such as learning how to make changes that contribute to an open-source routing platform as a team. During breaks, I heard about technologies useful for my future research and development through discussions with the team members. Finally, I would like to express my gratitude to the members of the Network Development Team who guided me and discussed with me, especially Mr. Shirokura, who assisted me in many ways, such as explaining the ecosystem surrounding FRR and creating the roadmap.