\listinginfo{}{msg}{is the object payload containing the actual message written by a user. The term 'message' is never used to describe information exchange between users in this document. 'Content' is mostly used instead.}{}
\listinginfo{}{payload}{There are two kinds of payload: message payload for message types, e.g. containing inventory vectors, and object payload, which is distributed throughout the network.}{}
\listinginfo{}{object}{is a kind of message whose payload is distributed among all nodes. Somtimes just the payload is meant.}{}
\subsection{Process Flow}
The newly started node \node{A} connects to a random node \node{B} from its node registry and sends a \msg{version} message, announcing the latest supported protocol version. If \node{B} accepts the version\footnote{A version is accepted by default if it is higher or equal to a nodes latest supported version. Nodes supporting experimental protocol versions might accept older versions.}, it responds with a \msg{verack} message, followed by a \msg{version} message announcing its own latest supported protocol version. Node \node{A} then decides whether it supports \node{B}'s version and sends its \msg{verack} mesage.
If both nodes accept the connection, they both send an \msg{addr} message containing up to 1000 of its known nodes, followed by one or more \msg{inv} messages announcing all valid objects they are aware of. They then send \msg{getobject} request for all objects still missing from their inventory.
\msg{Getobject} requests are answered by \msg{object} messages containing the requested objects.
A node actively connects to eight other nodes, allowing any number of incoming connections. If a user creates a new object on node \node{A}, it is offered via \msg{inv} to eight of the connected nodes. They will get the object and distribute it to up to eight of their connections, and so on.
The messages, objects and binary format are very well discribed in the Bitmessage wiki \cite{wiki:protocol}, the message description is therefore narrowed down to a description of what they do and when they're used.
A \msg{version} message contains the latest protocol version supported by a node, as well as the streams it is interested in and which features it supports. If the other node accepts, it acknowledges with a \msg{verack} message. The connection is initialized when both nodes sent a \msg{verack} message.
\subsubsection{addr}
Contains up to 1000 known nodes with their IP addresses, ports, streams and supported features.
\subsubsection{inv}
One \msg{inv} message contains the hashes of up to 50000 valid objects. If your inventory is larger, several messages can be sent.
\subsubsection{getdata}
Can request up to 50000 objects by sending their hashes.
\subsubsection{object}
Contains one requested object, which might be one of:
\listinginfo{}{getpubkey}{A request for a public key, which is needed to encrypt a message to a specific user.}{}
\listinginfo{}{msg}{Content intended to be received by one user.}{}
\listinginfo{}{broadcast}{Content sent in a way that the Addresses public key can be used to decrypt it, allowing any subscriber who knows the address to receive the such a message}{}
People looking at the PyBitmessage's source code might be irritated by some other messages that seem to be implemented, but aren't mentioned in the official protocol specification. \msg{ping} does actually cause the node that implements this to send a \msg{pong} message, but this feature isn't actually used anywhere. \msg{getbiginv} seems to be thought for requesting the inventory, but as I understand it can't be used. \cite{issue:112}
\textit{BM-2cXxfcSetKnbHJX2Y85rSkaVpsdNUZ5q9h}: Addresses start with "BM-" and are, like Bitcoin addresses, Base58 encoded\footnote{Which uses characters 1-9, A-Z and a-z without the easily confused characters I, l, 0 and O.}.
\listinginfo{}{ripe}{Hash of both public signing and encryption key. Please note that the keys are sent without the leading 0x04 in \obj{pubkey} objects, but for creating the ripe it must be prepended. This is also necessary for most other applications, so it's a good idea to do it by default.}{ripemd160(sha512(pubSigKey + pubEncKey))}
\listinginfo{}{checksum}{First four bytes of a double SHA-512 hash of the above.}{sha512(sha512(version + stream + ripe))}
Bitcoin uses Elliptic Curve Cryptography for both signing and encryption. While the mathematics behind elliptic curves is even harder to understand than the older approach of multiplying huge primes, it's based on the same principle of doing some mathematical operation that can be done fast one way but is very hard to reverse. Instead of two very large primes, we multiply a point on the elliptic curve by a very large number\footnote{Please don't ask me how to do it. If your're crazy enough, start at \url{http://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication} and \url{http://en.wikipedia.org/wiki/Elliptic_curve_cryptography}. If you're not that crazy, use a library like Bouncy Casle that does the hard work for you.}.
The user, let's call her Alice, needs a key pair, consisting of a private key
$$k$$
which represents a huge random number, and a public key
$$K = G k$$
which represents a point on the agreed on curve\footnote{Bitmessage uses a curve called \textit{secp256k1}.}. Please note that this is not a simple multiplication, but the multiplication of a point along an elliptic curve. $G$ is the starting point for all operations on a specific curve.
Another user, Bob, knows the public key. To encrypt a message, Bob creates a temporary key pair
$$r$$
and
$$R = G r$$
He then calculates
$$K r$$
uses the resulting Point to encrypt the message\footnote{A double SHA-512 hash over the x-coordinate is used to create the actual key.} and sends $K$ along with the message.
When Alice receives the message, she uses the fact that
The exact method used in Bitmessage is called Elliptic Curve Integrated Encryption Scheme or ECIES, which is described in detail on Wikipedia (\url{http://en.wikipedia.org/wiki/Integrated_Encryption_Scheme}).
To sign objects, Bitmessage uses Elliptic Curve Digital Signature Algorithm or ECDSA. This is slightly more complicated, if you want the details, Wikipedia is once again a fine starting point: \url{http://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm}.
Bitmessage doen't really scale. If there are very few users, anonymity isn't given anymore, and with many users traffic and storage use grows quadratically.
Proof of work has two uses. It helps to protect the network by preventing single nodes from flooding it with objects, and to protect users from spam. There's minimal proof of work required for the network to distribute objects, but users can define higher requirements for their addresses if they get spammed with cheap Viagra\texttrademark{} offers. The proof of work required for an address is defined in the \obj{pubkey}, and senders that are in a user's contacts should not be required to do the higher proof of work.
The difficulty is calculated from both message size as well as time to live, meaning that a message that is larger or stored longer in the network will be more expensive to send.
$$ d =\frac{2^{64}}{n (l +\frac{t l}{2^{16}})}$$
\begin{tabular}{@{}>{$}l<{$}l@{}}
d & target difficulty \\
n & required trials per byte \\
l & payload length + extra bytes (in order to not make it too easy to send a lot of tiny messages) \\
t & time to live \\
\end{tabular}
To do the proof of work, a nonce must be found such that the first eight bytes of the hash of the object (including the nonce) represent a lower number than the target difficulty.
\subsubsection{Message Size Limitation}
To prevent malicious users from clogging individual nodes, messages must not be larger than 256 KiB. Because of the proof of work, large objects arent' practical for normal use, but might be used to occupy nodes by sending them garbage.
The intended solution for this problem is splitting traffic -- addresses, more precisely -- into streams. A node listens only on the streams that concern its addresses. If it wants to send an object to another stream, it just connects to a node in this stream to send the object, then disconnects. When all active streams are full, a new one is created which should be used for new addresses.
The unsolved problem is to determine when a stream is full. Another issue is the fact that, as the overall network grows, traffic on full streams still grows, as there are more users who might wanto to write someone on the full stream.
Obviously it's trivial for an attacker to collect all (encrypted) objects distributed through the Bitmessage network -- as long as disk space is not an issue. If this attacker can somehow get the private key of a user, they can decrypt all stored messages intended for that user, as well as impersonate said user\footnote{The latter might be more difficult if they got the key through a brute force attack.}.
Plausible deniability can, in some scenarios, help against this. This action, called "nuking an address", is done by anonymously publishing the private keys somewhere publicly accessible\footnote{See \url{https://bitmessage.ch/nuked/} for an example.}.
Perfect forward secrecy seems impractical to implement, as it requires to exchange messages prior to sending encrypted content. That would in turn need proof of work to protect the network, resulting in twice the work for the sender and three times longer to send --- that is, if both clients are online. Exchanging messages would be all but impossible if both users are online sporadically.