This paper is not limited to the above definition of hosts behind a firewall; the same can (almost) be said
for any publicly accessible host (or hosts) with a public IP address serving, for instance, the Web (the HTTP protocol).
Hosts are identified by an IP address: a 32-bit value in IPv4 or a 128-bit value in IPv6.
My thoughts took the following path:
Say we get rid of the notion of "port" altogether and introduce four new concepts:
HOST,
SERVICE,
CHANNEL and
REGISTRATION.
What good does that do us? Well, if we combine all this with an OBLIGATORY
registration within each site for
every service (daemon) it can handle, we have solved the routing problem and helped each protocol that uses our "new"
transport protocols to handle virtuality. If we register each service that each host can handle (this also includes each
client that has a service running to be called upon when someone needs to use the service, say Skype) and we
have a master daemon within our local network handling the registration process for each daemon that needs to communicate
with the 'net, we know how to route each packet from the 'net into our local network. Note that the registration takes
place behind the firewall and is for the local network only (or the local HOST, depending on conditions).
Given the concept of
HOST,
SERVICE and
CHANNEL,
together with a
REGISTRATION daemon servicing the local network,
we can start to talk about the protocols needed to implement this.
(This is an ongoing process of evolution, i.e. "work in progress".)
We need at least 4 new protocols: a new UDP, TCP and ICMP called AUDP,
ATCP and AICMP, all on top of the existing
IPv4 protocol.
Note that
IPv6 Protocol
can also be used to transport the protocols described here from one HOST to another HOST. The specific IP version is not
relevant for the discussion of these protocols. As a side note, the
HIP protocol
can also be the base of transport without any change to this paper.
The way the new protocols are implemented is also somewhat new, yet it has been around for ages: we use ASCII as much as
possible. Why, you might ask, isn't that a waste of bandwidth?...
I was thinking in the following way:
From host 128.100.0.1 (HostA.domain.example.com) to 194.254.12.22 (ClientA.domain2.example.com):
If we need a checksum, we simply place it before the "DATA" part, as
"K<Any-number>". This way we are not limited to any range of values. If we need a specific
algorithm for the checksum, we add an option giving the algorithm name, as in
"OK<Any-Name>", e.g. "OKSUB16" (which is the default). To keep things simple, the algorithm option can be omitted when we
send further packets from the same client and identifier. See also Options.
The I token (local identifier) originates from the process itself, possibly from the creation of the SOCKET. It
must be unique within the site, i.e. behind a firewall, and should be checked by the registration daemon before being accepted.
The remote host has no choice but to trust that identifier; it is scoped to each host, i.e. IP address, and is of no value other than to
identify the responding process. See also ATCP for more thoughts about this identifier.
If a firewall changes any value in F, T or I, it MAY verify that F and T are the same as in the IP header before
retransmitting the packet to the local network. We can have a cache in the firewall to hold the latest result of such a lookup so
that transmission can be sped up... Note that this is implementation dependent and not covered here.
This protocol is used when a daemon would like to register itself to handle
requests for a service (or scheme).
The registration from the local net can look like this:
When the daemon would like to be deregistered it can send the following packet:
Up to now I haven't talked about the
CHANNEL concept at all. This is for good reason: it is optional most of the time
and defaults to zero (0) when not specified.
When a
SERVICE
needs several incoming channels in order to operate properly, it can be handled with a SUBCHANNEL,
i.e.
SERVICE,
CHANNEL,
SUBCHANNEL.
From host 128.100.0.1 (HostA.domain.example.com) to 194.254.12.22 (ClientA.domain2.example.com):
This means that if we get an acceptance of one packet and a denial for a previous one, we must retransmit from the first
denied packet, even if we have already sent one or more packets after it. Even if this is "bad", it ensures that
we never miss any packets in the event we receive packets out of order.
We can see here that the W and D tokens follow each other for one side and are reproduced in any accepted or denied
response.
In order to reduce congestion, the ATCP protocol also introduces a "windowing" technique that allows only a (small) number
of outstanding unaccepted packets. When the window is full, transmission is halted until a timeout or denying response
occurs, or an accepting packet arrives. This windowing is internal to the protocol handler only and not visible from the
outside. It is implementation dependent whether the window is configurable. The window is needed on the sending side only,
as any sequence number out of order gives a deny response telling the last accepted sequence number.
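The windowing and retransmission rules can be sketched in Python. This is only an illustration under stated assumptions: the W value in an A or NA response is taken to be the next sequence number the receiver expects, and the class name Sender and the default window of 4 packets are hypothetical choices, not part of the protocol.

```python
from collections import deque


class Sender:
    """Sending-side window: a bounded queue of unaccepted (seq, data) pairs."""

    def __init__(self, initial_seq: int, window: int = 4):
        self.next_seq = initial_seq   # W value for the next D token
        self.window = window          # max outstanding unaccepted packets (assumed)
        self.unacked = deque()        # packets sent but not yet accepted

    def send(self, data: bytes):
        """Return (seq, data) to transmit, or None when the window is full."""
        if len(self.unacked) >= self.window:
            return None
        segment = (self.next_seq, data)
        self.unacked.append(segment)
        self.next_seq += len(data)    # W grows by the length of each data part
        return segment

    def on_accept(self, w: int):
        """Drop every packet that ends at or before the accepted sequence."""
        while self.unacked and self.unacked[0][0] + len(self.unacked[0][1]) <= w:
            self.unacked.popleft()

    def on_deny(self, w: int):
        """On NA (or a timeout): resend the first unaccepted packet and all after it."""
        self.on_accept(w)
        return list(self.unacked)
```

A timeout would be handled by calling on_deny with the last accepted sequence number, since the paper gives timeouts the same handling as a deny.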
We have seen that the ATCP packet has two modes, a starting mode and a servicing mode. We also have a third
mode, called closing mode.
We have not yet talked about the W sequence number. It is started by the first packet and increased by the length of each data part,
so that we are defining a length of the sent stream. This sequence COULD but SHOULD NOT wrap around during any one
session, in order to be more robust when transmitting data. In order to handle the traffic, the initial sequence number
should not be above 2^(max/2)-1 in value, where max is the maximum number of bits the sequence number can hold.
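As a small worked example of that bound in Python: the paper does not fix the width of the sequence number, so the 32-bit default below is purely an assumption, and the function names are hypothetical.

```python
import secrets


def max_initial_sequence(max_bits: int) -> int:
    """Upper bound 2^(max/2)-1 for the initial sequence number."""
    return 2 ** (max_bits // 2) - 1


def pick_initial_sequence(max_bits: int = 32) -> int:
    """Pick a random initial W within the bound (32 bits is an assumed width)."""
    return secrets.randbelow(max_initial_sequence(max_bits) + 1)
```

With an assumed 32-bit sequence number the bound is 65535, which leaves most of the number space free for the stream before any wrap-around.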
(To be defined)
This protocol is used when any errors should be returned by the other protocols.
The following errors are already defined with this protocol:
1,service unavailable
Note that the "Host unavailable", "Network unreachable", "Ping" and "Ping-reply" are already handled by the normal ICMP
protocol.
The "user", "password" and "resource" are not part of the protocols presented in this paper.
(To be defined)
We can migrate to this new paradigm in different ways. One way is to implement the needed kernel modules
defining the new protocols, add the client library routines to access said protocols and create some userland programs to administer the kernel
modules (this method is actually preferred).
(To be defined)
Here are the things in the protocols that are missing or not fully defined...
Currently missing things
If we implement this in the REGISTRATION daemon we get some security
by implication, as no local client can register itself to take over any "desired" daemon, for example our web server. If the normal
daemon is not started for some unknown reason, a client host should not be able to "hijack" that service for that domain
without consent from the network administrator.
The REGISTRATION daemon itself. Here there are some uncertainties. It is thought of in this paper as a daemon taking
care of local network registrations so that the public internet can access the SERVICEs that are behind a firewall. This is true
but not the whole picture. I have totally lost sight of HOST local registration. There needs to be a
REGISTRATION daemon on every publicly available HOST, as well as on every locally
available host within our own local network. In short, every HOST, public or local, needs a
REGISTRATION daemon.
The HOST local daemon can, if we wish, be the one that registers the SERVICE to the rest of the local network as well as
to the public REGISTRATION daemon, using the AREG protocol.
"Privileged access"... this whole area is non-existent in this paper; there is no such thing as a privileged service.
RFC 3168
(ECN congestion control) is not addressed in this paper. Currently, only dropping packets to indicate congestion is implied in this
paper. The whole discussion about congestion is left outside of this document, but should really be addressed at some point.
There is nothing in this paper that inhibits congestion control.
Currently partly defined things
(To be defined)
This new network paradigm doesn't in any way remove the existence of UDP and/or TCP, even though it has the same
purpose (in a broad sense) as these protocols. Note that
ICMP
MUST still be implemented (it can also be ICMPv6 if the IP is v6). The new protocols and the old can all still coexist if needed.
In time, though, this paradigm can win over the old protocols simply by convenience. We handle the current situation, with its
evolved state of public network and firewalled local networks,
and still give a vision of "service available" at any time. There is nothing preventing the registration daemon from being
implemented in kernel space, even though this paper presumes it to be in userland. It is, after all, implementation dependent.
In order to achieve maximum advantage from these protocols, each should be given a separate protocol number by
IANA so that they can be handled in a firewall and the TCP/IP stack as if within that domain, with all its current implementation
of filtered networks and denied access from "bad" hosts defined by the network administrator. Nothing within this paper
should suggest otherwise.
Table of contents
How it started
Paradigm change
The nitty gritty
The REGISTRATION daemon
The protocol AUDP
The protocol AREG
The protocol ATCP
The protocol AICMP
Protocol options
Relation to URI/URL
Migration to new paradigm
Items still unresolved
Conclusion
History of the protocols
How it started
I was reading a book about network protocols and my mind started to wander a bit.
Today we have roughly the following paradigm when it comes to addressing hosts and services:
The services available on any one host are identified by a "well-known port" number ranging from 1 to 65535, i.e. an unsigned
16-bit integer. The ports numbered from 1 to 1023 are so-called "privileged" ports, i.e. they are only available to a process started
by the administrator of the host.
To add to the confusion, any client program (really client process) also has, for each accessed service, a port open to
receive the replies from the service on a specific host. We thus have a source port and a destination port. When it comes to
the TCP protocol, we also have a session id, defined by the protocol in a defined manner (see RFC of
TCP protocol).
This works reasonably well...
The internet has evolved from a few hosts with a few services (daemons) running on them to the concept of "The Web",
i.e. an HTTP service by which one or more hosts serve a site in one way or another. Any company (and for that matter home LAN)
is best protected by a firewall, or you put yourself at risk and/or you can't have more than one machine facing the 'net as you only
paid for one public IP address.
Well, what if you have several domains served by this local network behind the firewall? Or similar services on several
client machines, such as VoIP or P2P programs? Say you have several virtual machines running to handle the different
domains?
Today we solve this by using protocols like UPNP to patch the firewall with a port for our service.
Sometimes we need to announce our port to the 'net in order to be contacted from outside the firewall.
But "virtual domain" handling is somewhat more complex and not really handled well in the current
implementation of transport protocols (i.e. UDP and TCP). It is solved in the application layer by each of the protocols
that need (or enable) that function (the virtuality). HTTP has within its own protocol a header, Host:, that tells the daemon
servicing the request what part of the hard drive to use. Most protocols have no notion of virtuality at all (like HTTPS,
secure web) and make us adopt a 1-to-1 relation between IP address and domain name. In IPv6 it is at least possible
to use a lot of addresses, but IPv4 doesn't have that luxury.
Some web hosting companies also can't offer secure web to their clients as they themselves are short of IP addresses
to start with and/or have problems registering PTR records in the DNS.
Paradigm change
Basically, if we move the URI notion to just above protocol
IP and
(see also
HIP protocol)
to be used by a new UDP, TCP and ICMP we can have
a much easier way of handling the 'net. I'm not saying that we must use URIs, but the parts that make them up, or most
of them, SHOULD be present.
The SERVICE
has another name,
SCHEME,
when we talk about URIs; the two will be used interchangeably within this document.
We can say that the
REGISTRATION here is like the UPNP protocol used traditionally, although it also serves for routing.
The nitty gritty:
We also need a new registration protocol, called AREG,
that each daemon (service) should use in order to announce that it is up and
running. This SHOULD also handle the case where the host running said registration
service goes down: when it comes up again
it should either ask the local network for all its services to re-register and/or have a permanent local database of all services
that
are up within the local network and enable each of them at startup. The exact way is yet to be determined; possibly one can use
both. As each registration by itself is a request for service, we can save that state in the
registration daemon so that it can confirm upon next startup that it still holds.
This method has the advantage that if we cross any router boundary on our local network, we are not left without help.
On the other hand, if we broadcast a request for re-registration by the services that are still up,
we have a somewhat smaller net load on the local network. The benefit can in most circumstances be neglected, though.
Well, yes and no. To some extent we always waste bandwidth using ASCII compared to BINARY implemented protocols.
But we also gain something: morphability, i.e. a way of changing the protocol to be state aware, easy to implement,
and easy to adapt to new circumstances not covered here...
Also, the information is mainly in ASCII form anyway.
Take the normal protocols up to IP (v4 or v6 doesn't matter) unchanged, and all application protocols unchanged too.
Below are the protocols implemented using UDP port 52118 as transport, that is: send all traffic to the indicated
HOST on port 52118, as this port is connected to the REGISTRATION daemon.
(The protocols used when having native protocol numbers are not described here, though they should be.)
The protocol AUDP
U0FClientA.domain2.example.com THostA.domain.example.com Shttp I<random-token> D<any-application-protocol>
The response (should it be necessary) can thus be:
U0TClientA.domain2.example.com FHostA.domain.example.com I<random-token> D<any-application-protocol-responce>
The way we implement this is thus: <Token><Value><Space>.
The first "U" defines the protocol as AUDP and MUST always be present as the first byte in the packet.
The zero after that is the version number of this protocol and MUST also be present in all protocols in this paper.
Note that the tokens are not in any specific order, except the last one, "DATA", which is always last. That way we can
handle binary application protocols as well.
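The <Token><Value><Space> layout can be read with a few lines of Python. This is a minimal sketch, not a reference implementation: it assumes single-letter tokens only (so the multi-letter tokens used elsewhere in the paper, such as OK or CS, are not handled), that the version is a run of ASCII digits, and that everything after the D token is opaque data. The function name parse_packet is hypothetical.

```python
def parse_packet(packet: bytes) -> dict:
    """Split '<proto><version><Token><Value><Space>...D<data>' into a dict."""
    if len(packet) < 2:
        raise ValueError("packet too short")
    proto = chr(packet[0])                 # 'U' for AUDP
    pos = 1
    while packet[pos:pos + 1].isdigit():   # version digits follow the first byte
        pos += 1
    if pos == 1:
        raise ValueError("missing version number")
    fields = {"proto": proto, "version": int(packet[1:pos].decode("ascii"))}
    while pos < len(packet):
        token = chr(packet[pos])
        if token == "D":
            # DATA is always last and may be binary, so keep it as raw bytes.
            fields["D"] = packet[pos + 1:]
            break
        end = packet.find(b" ", pos)
        if end == -1:
            end = len(packet)
        fields[token] = packet[pos + 1:end].decode("ascii")
        pos = end + 1
    return fields
```

Because D is only recognised at a token position, a value such as a host name beginning with "D" in an F or T token is not mistaken for the data part.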
We also have both the from and the to address in letters, so that the receiving party can verify that the packet
comes from the stated host. It SHOULD resolve to the same IP address used in the IP packet, or else it CAN be rejected/dropped.
We also use the received host name when we respond to a packet in AUDP (this is not always needed
in ATCP, depending on connection state).
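The check that the F host name matches the IP source address could look roughly like this in Python. It is a sketch under assumptions: it uses the standard library call socket.gethostbyname_ex, which covers IPv4 only, and the function name sender_matches and the injectable resolve parameter are hypothetical conveniences for testing.

```python
import socket


def sender_matches(claimed_host: str, source_ip: str,
                   resolve=socket.gethostbyname_ex) -> bool:
    """True when the name in the F token resolves to the IP source address."""
    try:
        _name, _aliases, addresses = resolve(claimed_host)
    except OSError:
        # Name does not resolve: treat as a mismatch, the packet CAN be dropped.
        return False
    return source_ip in addresses
```

A firewall that caches lookups could pass its own cached resolver as the resolve argument instead of querying DNS for every packet.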
The
SERVICE
is only needed when we initiate the packets; any response is identified by the lack of
SERVICE identifier.
If the
SERVICE
identifier is missing and we have not sent a request, that means that the packet is a bogus packet
that can be discarded/dropped.
We SHOULD get an error in the event the service is missing when it is needed or the daemon is not running on the host,
stating "Service not available". (Also define packet).
Any packet lacking any of the F, T or I tokens SHOULD NOT be handled at all, except for an optional error; i.e. they are mandatory when
initiating any request, even when we are not expecting any response.
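These acceptance rules can be summarised in a short Python sketch. It assumes the packet has already been split into a dict of tokens, and that a response is matched to an earlier request by its I token (the AUDP example reuses the same token in both directions); the function name classify and the pending set are hypothetical.

```python
def classify(fields: dict, pending: set) -> str:
    """Decide how to treat a parsed packet: 'request', 'response' or 'drop'."""
    # F, T and I are mandatory; without them the packet is not handled.
    if not all(token in fields for token in ("F", "T", "I")):
        return "drop"
    # A SERVICE token marks an initiating packet.
    if "S" in fields:
        return "request"
    # No SERVICE token: only valid as the answer to a request we sent.
    return "response" if fields["I"] in pending else "drop"
```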
It is thus up to the administrator of the site to verify the packets received (i.e. by use of a proper program/daemon,
of course)...
The protocol AREG
It SHOULD only handle requests to register from the local network. If we need to register any daemon from outside
the firewall, for security reasons that can be handled by manual intervention of the administrator by placing the address
and service in a database used by the registration daemon. How this is handled in detail is up to the creator of the
registration daemon.
From host 192.168.0.20 (HostA.local.net.example) to 192.168.0.5 (RegisterA.local.net.example):
R0FHostA.local.net.example HHostA.domain.example.com TRegisterA.local.net.example Shttp I<random-token> DRegister
The response (should it be necessary) can thus be:
R0THostA.local.net.example FRegisterA.local.net.example I<random-token> DRegistered
The first "R" here tells us that we are sending AREG packets.
The zero after that is a version number of this protocol and MUST also be present in all protocols in this paper.
This tells the registered daemon that it is registered with the local Registrar and that the local Identifier is unique.
If the Identifier is already in use in the local network, it should not be allowed to be used again before it is deregistered.
Note that the Identifier is not always needed, such as in the AUDP protocol, when the service is identified by service name.
R0FHostA.local.net.example HHostA.domain.example.com TRegisterA.local.net.example Shttp I<random-token> DDropMe
The response (should it be necessary) can thus be:
R0THostA.local.net.example FRegisterA.local.net.example I<random-token> DDropped
This tells the registered daemon that it is not registered with the local Registrar any more and that the local Identifier
is freed.
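The bookkeeping behind Register and DropMe can be sketched as a small in-memory table in Python. This is an illustration only: the class name Registrar and its methods are hypothetical, the paper does not define a reply for a refused registration (None is used here), and a real daemon would also persist its state.

```python
class Registrar:
    """In-memory table of registered services, keyed by local identifier."""

    def __init__(self):
        self._by_id = {}   # identifier -> (local_host, public_host, service)

    def register(self, local_host, public_host, service, identifier):
        """Handle DRegister: refuse an identifier that is already in use."""
        if identifier in self._by_id:
            return None    # reply for a refusal is undefined in the paper
        self._by_id[identifier] = (local_host, public_host, service)
        return "Registered"

    def deregister(self, identifier):
        """Handle DDropMe: free the identifier for later reuse."""
        if self._by_id.pop(identifier, None) is None:
            return None
        return "Dropped"

    def route(self, public_host, service):
        """Local hosts currently serving a public name (H token) and service."""
        return [local for local, public, svc in self._by_id.values()
                if public == public_host and svc == service]
```

The route lookup is what lets the firewall side forward a packet addressed to a public host name and SERVICE to the right machine on the local network.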
Its use is to separate several daemons servicing the same scheme (e.g. HTTP or SKYPE) on the same
HOST.
The
CHANNEL is unique within the
SERVICE
only, so any
SERVICE
is uniquely identified by the
SERVICE,
CHANNEL
pair, where the
CHANNEL is optional and defaults to zero (0) within one
HOST
(named that is, not IP address).
It is used as "C<Any-number>", e.g. "C1".
Note that it can actually be any number (within reason, i.e. unsigned and not too big), and it is the daemon that uses
it in the registration process that defines the number.
It is used as "CS<Any-number>", e.g. "CS1".
Note that it can actually be any number (within reason, i.e. unsigned and not too big), and it is the daemon that uses
it in the REGISTRATION process that defines the number.
The protocol ATCP
(To be defined)
T0FClientA.domain2.example.com THostA.domain.example.com Shttp I<i-random-token> W<random-sequence-number> D<any-application-protocol>
The response (should it be necessary) can thus be:
T0TClientA.domain2.example.com FHostA.domain.example.com I<my-random-token> M<i-random-token> W<my-random-sequence-number> D<any-application-protocol-responce>
The "T" at the beginning tells us that we are sending ATCP packets and MUST be the first thing in every packet.
The zero after that is a version number of this protocol and MUST also be present in all protocols in this paper.
The same goes here for the checksum part found in the AUDP packet.
In the first response, an "M" identifier is introduced. This tells us that we have established the CHANNEL with the remote
process and are hereafter able to change protocol definition.
Note that the I token ALWAYS refers to my token when sending and the M token ALWAYS refers to the remote side's
I token. Each side now knows what the remote process is called within
the network, and this enables us to switch to using those tokens instead. The I token as well as the M token are registered in
the remote router together with where this session should go.
The I and M tokens should be regenerated for each session in order to make any hijack attempts harder. For this
to happen, the router daemon communicates with the registration daemon and changes the tokens on each session for
each service registered. Here we can have a back propagation to the actual daemon servicing the scheme, so that it knows
by what identifier it is called at the moment. There could be several at any given time, not just one.
The W token is a random sequence
number that tells each side the other side's initial sequence number. Hereafter it is incremented by the length of each D token
sent.
T0I<i-random-token> M<m-random-token> W<sequence-number> D<any-application-protocol>
The response can thus be:
T0I<m-random-token> M<i-random-token> W<sequence-number> A
The response is either A, for "accepted", confirming the sequence numbers presented, corrected for the last received data size, or
NA, for denying verification of the presented sequence
numbers. When a deny is received, or a timeout occurs (same handling), the first unaccepted sequence is presented to be
retransmitted, and thus all packets after it as well.
T0I<i-random-token> M<m-random-token> W<sequence-number> CL
The response can thus be:
T0I<m-random-token> M<i-random-token> W<sequence-number> A
If we want to speed up the shutdown process, we can accept the accepted closing...
The response to an accepting response from the remote side can thus be:
T0I<i-random-token> M<m-random-token> W<sequence-number> A
This last packet, sent by the side that initiated the closing, tells the remote side that it has seen the closing response, in order
to speed up the closing sequence and free the resources more quickly.
The CL token is used to initiate closing of the stream. The accepted response confirms that the remote party
has closed the session. The remote party (i.e. the one accepting the closing) should hold the session in a zombie state
for a network timeout period before freeing the resources connected with the stream. This is in order to positively
reacknowledge any retransmitted closing requests. The sequence number used in the CL token packet must be one above
the last accepted data part.
The protocol AICMP
One packet can look like this:
I0FClientA.domain2.example.com THostA.domain.example.com Shttp I<i-random-token> W<random-sequence-number> E<any-error-number,any-error-text>
The "I" at the beginning tells us that we are sending AICMP packets and MUST be the first thing in every packet.
The zero after that is a version number of this protocol and MUST also be present in all protocols in this paper.
The error text is optional and can be omitted from an implementation. When introducing a new error, the text SHOULD be
presented so that the implementation in the local network can be changed if possible.
The I and W tokens are optional, depending on the available information in the faulty packet. Some tokens can be added
without redefinition, as all optional tokens can be here from the other protocols.
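A short Python sketch of building and reading the E token follows. It assumes that E is placed last in the packet, since the error text may contain spaces (the paper does not state this), and the function names build_error and parse_error are hypothetical. The optional I and W tokens are left out for brevity.

```python
def build_error(frm, to, number, text=None, service=None) -> bytes:
    """Build an AICMP packet: 'I0F<from> T<to> [S<service>] E<number>[,<text>]'."""
    tokens = [f"F{frm}", f"T{to}"]
    if service is not None:
        tokens.append(f"S{service}")
    # The error text is optional; only the number is always present.
    error = str(number) if text is None else f"{number},{text}"
    tokens.append(f"E{error}")             # assumed to be the last token
    return ("I0" + " ".join(tokens)).encode("ascii")


def parse_error(value: str):
    """Split the value of an E token into (number, text or None)."""
    number, _, text = value.partition(",")
    return int(number), (text or None)
```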
Protocol options
The protocols have options to change the behaviour of certain parts of the protocols. The Checksum algorithm is such an
option. See also Checksum definition.
Relation to URI/URL
The protocols presented here are not, in a strict sense, using URIs to communicate. But we are using some of their components
to handle virtual hosts and to verify that a packet comes from the right sender. Note that man-in-the-middle attacks are still
possible for IPv4, as they can't reliably be handled by the IPv4 protocol itself.
We can say that a protocol using these transport protocols can use a URI looking like this:
<SCHEME>
://<user@><password:>
<HOST>
<:
CHANNEL>
</resource>
Migration to new paradigm
Another way is to use UDP as transport and use a port that is not normally used, say 23528, to transport the new
protocol. This method implements the protocols in userland only, and will need some program wrappers to function with
old daemons. Some library routines will eventually be needed to handle the protocol anyway... This is the method I will
start to implement to see if it can be done, and if so, do some evaluation of the design thoughts presented in this
paper. Hopefully I can implement some daemons doing the protocols while talking to the registration daemon as well
and at some point even do the routing in the registration daemon.
Items still unresolved
How do we register a client machine (or any machine) that is behind a firewall as a specific public machine?
That is, say we add to our DNS a record describing HostA.ourdomain.com that points to our firewall's IP address. When
we do a name lookup by querying the DNS daemon, it resolves to the public IP address of the firewall. But how shall we
tell our REGISTRATION daemon that a specific client, say HostB.local.net, is to
handle our HTTP SCHEME in a generic way?
The current thoughts are that either the daemon itself knows what external name it serves, or we register this
in either the REGISTRATION daemon or, preferably, in a local DNS database in a
predictable way. If we add a TXT record to a local
DNS database, telling that a host also has some other information, like "TXT EX HostA.ourdomain.com SERVE HTTP 0", then the client, as
part of getting its name defined, also knows what external service to handle when mapping the service it starts to the TXT
record in the local DNS. The last zero (0) is the CHANNEL number and is only shown as an example; only nonzero numbers are
REQUIRED.
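Reading such a TXT record could be sketched in Python as follows. The record layout is taken from the single example above and is not otherwise specified, so the accepted forms (with or without the leading "TXT", channel optional) are assumptions, and the function name parse_external_record is hypothetical.

```python
def parse_external_record(record: str):
    """Parse 'TXT EX <external-host> SERVE <service> [<channel>]'.

    Returns (external_host, service, channel); the channel defaults to 0.
    """
    parts = record.split()
    if parts and parts[0] == "TXT":        # tolerate the record type prefix
        parts = parts[1:]
    if len(parts) not in (4, 5) or parts[0] != "EX" or parts[2] != "SERVE":
        raise ValueError(f"not an external-name record: {record!r}")
    channel = int(parts[4]) if len(parts) == 5 else 0
    return parts[1], parts[3], channel
```

A daemon starting up on the local host could use the returned triple to fill in the H, S and C tokens of its registration packet.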
The concept is easily handled by predefining the external name in the REGISTRATION
daemon so that the given client can register its service for the network as intended. This means that the
REGISTRATION daemon gives the local client daemon the external name associated
with the service. I think I will implement this as a start...
If it should be, well then it must be homogeneous with the rest of the paradigm... The notion of separate things allowed
to separate users is of course not new or unfamiliar; still, the "best practice" way of doing it is, I believe, yet to be discovered.
We have a whole class of related problems: the network administrator has to allow or disallow service based on credentials...
we have the whole trust thing to handle in a graceful manner... The best thing is to make a special provision for it in the
protocols, but not to use it unless we can trust it to a satisfactory degree. The services handling this type of problem today
do it in the application layer. That is still fully valid in the client access area. When it comes to validating the
REGISTRATION
daemon and the SERVICE to be registered, let's put it like this: if the service is started by a non-privileged user to handle some
information, that can be less fatal network-wise than if it were started by a privileged user in the first place, as a crack
based on a flaw in the daemon gets only as far as that user allows. We have the AUTH protocol ready in case the user
needs to be checked. That doesn't change as far as I can tell. The fewer daemons that need to run as a privileged user
at all times, the better. Credentials, on the other hand, can come into play later...
AICMP. This protocol will by nature change as the protocols evolve.
Conclusion
It has been shown that by eliminating the "port" concept in network protocols and forwarding the names of the
participating HOSTs to both parties, we eliminate some confusion and add some clarity to all protocols, even application
protocols. When we introduce the REGISTRATION process, we also enable firewalls to route traffic from the public
internet to services on a local network not otherwise visible to the public. By describing to each participating HOST on the
local network where the network REGISTRATION daemon is, we also make it possible to do automatic updates of
services handled by the whole local network. It can also be extended (not described here) to hold for the local
network itself, so that public names of services can be accessed as if they were requested from outside the network.
When we deal with more complex services, those that need several incoming channels open, we can also handle this
by describing the SUBCHANNEL in the protocols.
Copyright © 2007 by