Throughput Issue in LTE - Debugging
Third, ask yourself, "Do I have the knowledge and skills to analyze each and every component I wrote down at step 1?"
It is unlikely that you will be the one who knows everything, so at least line up other people who can help you analyze
the data.
Fourth, try to identify the important parameters influencing the throughput; the more, the better. The following is an
example list coming from my experience. I have split these factors into two main categories, as listed below.
L1/PHY Factors
One simple and obvious rule of throughput is "higher-layer throughput can never
be larger than lower-layer throughput". Putting it another way with a specific
case: IP-layer throughput can never be larger than L1/PHY throughput. This
implies that if you want to get the maximum IP throughput, you first have to
guarantee that L1/PHY is in a condition that allows its maximum capacity with no
errors. More specifically in LTE, it means the maximum possible transport block size is
allocated in every subframe and there is no HARQ NACK/DTX for any transmission.
In many cases, checking this condition for every subframe is very tedious and time
consuming, but without this step, trying to achieve higher-layer throughput is
almost meaningless.
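As a quick sanity check, you can translate the per-TTI transport block limits into the theoretical L1 ceiling for each UE category. Below is a minimal sketch in Python using the maximum DL-SCH bits per TTI from 3GPP TS 36.306; any higher-layer target above these numbers is physically impossible.

```python
# Rough L1/PHY throughput ceiling per UE category.
# Max DL-SCH transport block bits per 1 ms TTI (3GPP TS 36.306).
MAX_TBS_PER_TTI = {
    "Cat 3": 102048,
    "Cat 4": 150752,
    "Cat 6": 301504,
}

TTI_SEC = 0.001  # one LTE subframe = 1 ms

for cat, tbs_bits in MAX_TBS_PER_TTI.items():
    mbps = tbs_bits / TTI_SEC / 1e6
    print(f"{cat}: {mbps:.1f} Mbps L1 ceiling")
```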
I have been asked many times to troubleshoot various throughput issues
without any lower-layer information. The first difficulty I had was getting people
to understand why this information is important.
TBS (Transport Block Size)

In ideal conditions with a very good radio signal, we get higher throughput as
we increase the transport block size (TBS). But if we increase the TBS when the radio
condition is poor, the chance of reception failure gets high, resulting in a
lot of retransmissions and, in turn, throughput degradation.

When we are doing a throughput test with test equipment, handling the TBS is
pretty straightforward, since we can explicitly set whatever TBS we like for each
and every subframe. However, if we are doing the throughput test in a live network,
in most cases we do not have such controllability. In that case, we need
very detailed logging that shows the TBS allocation for each and every subframe.
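If your tool can export such a log, even a few lines of scripting turn it into scheduled versus delivered throughput. The sketch below assumes a hypothetical log format of (subframe index, TBS in bits, CRC pass/fail) tuples; adapt the parsing to whatever your logging tool actually produces.

```python
# Hypothetical per-subframe log: (subframe index, TBS bits, CRC pass).
log = [
    (0, 75376, True),
    (1, 75376, True),
    (2, 75376, False),   # CRC fail -> HARQ retransmission
    (3, 51024, True),
]

duration_s = len(log) * 0.001                   # one entry per 1 ms TTI
scheduled = sum(tbs for _, tbs, _ in log)       # everything the eNB sent
delivered = sum(tbs for _, tbs, ok in log if ok)
bler = sum(1 for _, _, ok in log if not ok) / len(log)

print(f"scheduled: {scheduled / duration_s / 1e6:.1f} Mbps")
print(f"delivered: {delivered / duration_s / 1e6:.1f} Mbps")
print(f"BLER: {bler:.0%}")
```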
Code Rate
The code rate can be defined as the ratio between the amount of data allocated in a
subframe and the maximum amount of data that could ideally be carried in that
subframe. Code rate may not be considered a direct factor for throughput, but
in some cases it can negatively influence it: if the code rate gets too
high, the probability of CRC errors rises, leading to retransmissions and in turn to
lower throughput. Code rate tends to start being an important factor around Category
3 (100 Mbps) throughput and becomes an even more common factor from Category 4 on.
CFI Value
CFI is an indicator telling how many OFDM symbols are used for carrying control
channels (e.g., PDCCH and PHICH) in each subframe. CFI is not a direct factor, but it
influences the code rate, which in turn may lead to throughput variation.

CFI = 3 for 1.4, 3 and 5 MHz system bandwidths
CFI = 2 for 10, 15 and 20 MHz system bandwidths

See the CFI page on how CFI can influence code rate and throughput.
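To make the relationship concrete, here is a rough code-rate estimate for one subframe. It is a simplification (normal CP, two CRS ports, no PSS/SSS/PBCH or CSI-RS overhead), so treat the numbers as estimates, not exact 36.213 values.

```python
# Approximate PDSCH code rate for one downlink subframe.
# Simplifications: normal CP, 2 CRS antenna ports, no PSS/SSS/PBCH
# or CSI-RS overhead. Treat the result as a rough estimate.
def code_rate(tbs_bits, n_prb, cfi, bits_per_symbol):
    re_per_prb_pair = 12 * 14        # subcarriers x OFDM symbols
    control = 12 * cfi               # control region (PDCCH/PHICH/PCFICH)
    crs = 16                         # cell-specific RS, 2 ports (approx.)
    data_re = (re_per_prb_pair - control - crs) * n_prb
    crc = 24                         # transport block CRC bits
    return (tbs_bits + crc) / (data_re * bits_per_symbol)

# Max 100-PRB transport block (75376 bits) with 64QAM (6 bits/symbol):
print(f"CFI=1: code rate = {code_rate(75376, 100, 1, 6):.3f}")
print(f"CFI=3: code rate = {code_rate(75376, 100, 3, 6):.3f}")
```

Note how CFI = 3 pushes the maximum 100-PRB transport block past a code rate of 1.0, i.e. it simply cannot be scheduled; in practice anything much above roughly 0.93 will not decode.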
CQI

When the UE reports a lower CQI value than it is supposed to, reception reliability
may increase a little, since the network will allocate a smaller TBS than the UE can
actually handle, but you will get somewhat less throughput than the maximum
capacity would allow.

When the UE reports a higher CQI value than it is supposed to, reception reliability
may decrease and cause reception errors if the network allocates the maximum TBS for
the CQI value the UE reported.
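The trade-off is easy to see with a toy calculation. In the sketch below the TBS and BLER numbers are invented for illustration (they are not 3GPP values): under-reporting wastes capacity, accurate reporting sits near the usual ~10% BLER operating point, and over-reporting loses more to retransmissions than the bigger TBS gains.

```python
# Toy numbers only: expected goodput = TBS x (1 - BLER) per 1 ms TTI.
def expected_mbps(tbs_bits, bler):
    return tbs_bits * (1 - bler) / 0.001 / 1e6

cases = {
    "under-report": (36696, 0.01),   # small TBS, almost no errors
    "accurate":     (46888, 0.10),   # typical ~10% BLER target
    "over-report":  (61664, 0.45),   # big TBS, but many CRC failures
}
for name, (tbs, bler) in cases.items():
    print(f"{name:12s}: {expected_mbps(tbs, bler):.1f} Mbps expected")
```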
Transmission Mode
In ideal conditions, a transmission mode for MIMO (e.g., TM3, TM4) leads to higher
throughput than the transmission mode for SISO (TM1) or diversity (TM2).
Factors other than L1/PHY

TCP Window Size

In most cases, this value does not influence throughput much in my experience,
but there were some cases in which I had to tweak it several times to achieve
the ideal throughput. It is hard to say whether a large value or a small value
is better; you may need to tune it depending on the situation.
Generally speaking, a larger TCP window size may help achieve higher throughput,
but there can be some overhead with it. Recently, many UE and PC TCP stacks
keep changing the TCP window size dynamically based on their own internal algorithms.
This is fine if everything works, but it is very hard to troubleshoot when this
dynamic window resizing causes a problem.
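One way to take the autotuning out of the equation during a test is to pin the socket buffer (and hence the advertised window) to a known value. A minimal Python sketch, assuming you control the receiver side:

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Request a 1 MB receive buffer before connecting, so window scaling
# is negotiated accordingly. The OS may clamp or round this value.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1 << 20)
print("effective SO_RCVBUF:",
      sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
```

Repeating a run with a fixed buffer versus the default autotuned one quickly tells you whether the dynamic resizing is part of the problem.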
RTT (Round Trip Time)

RTT does not influence UDP throughput much, but it influences TCP-based
throughput (e.g., FTP, HTTP) a lot. I strongly recommend trying the throughput
test with different RTTs to see how your device is influenced by this factor. In
my experience, I see a great deal of throughput reduction once the RTT gets over
50~60 ms.
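The reason is the bandwidth-delay product: TCP can have at most one window in flight per round trip, so throughput is bounded by window / RTT. A quick calculation shows why things fall apart past 50~60 ms:

```python
# TCP throughput ceiling: at most one window per round trip.
window_bytes = 64 * 1024        # 64 KB, i.e. no window scaling in use

for rtt_ms in (10, 30, 50, 60, 100):
    mbps = window_bytes * 8 / (rtt_ms / 1000) / 1e6
    print(f"RTT {rtt_ms:3d} ms -> at most {mbps:5.1f} Mbps per connection")
```

Conversely, sustaining 100 Mbps at 60 ms RTT needs a window of about 100e6 x 0.06 / 8 = 750 KB, which is exactly why the window size factor above matters.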
IP Packet Size

All test equipment (and probably live networks too) has a limit on the number
of RLC SDUs, PDCP packets, etc. that it can process within one TTI. So if the
average IP packet size being pumped from the test tool is small, the maximum
throughput will be lower than expected even when the tool is generating plenty of
IP packets. (Some people test throughput not only with the maximum IP packet size,
e.g. around 1300~1500 bytes per packet, but also with various combinations of
smaller packets.)
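You can estimate whether packet size is the bottleneck by converting the target rate into packets per TTI and comparing it against the per-TTI PDU limit of your equipment (the limit below is only a placeholder; check your equipment's specification):

```python
# Packets the stack must handle per 1 ms TTI at a given rate.
def packets_per_tti(rate_mbps, packet_bytes, tti_s=0.001):
    return rate_mbps * 1e6 * tti_s / (packet_bytes * 8)

PDU_LIMIT_PER_TTI = 32   # placeholder; check your equipment's spec

for size in (1500, 1300, 512, 128):
    n = packets_per_tti(150, size)
    flag = "OK" if n <= PDU_LIMIT_PER_TTI else "exceeds limit"
    print(f"{size:4d} B packets @ 150 Mbps: {n:6.1f} per TTI ({flag})")
```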
MTU Size
MTU size depends on the capability of each NIC (Network Interface Card) and is
also related to the IP packet size. In most cases this value is set somewhere
between 1200 and 1500 (the Windows default for Ethernet is 1500). You may need
to try several different values to find which one works best. (Refer to the
Setting MTU Size section to change the value on Windows.)
Data Buffer Size in Test Software

Most IP throughput applications have one or more internal data buffers.
Sometimes, especially in very high throughput cases, these buffer size settings
are very important for achieving the targeted throughput and keeping it stable
(e.g., the internal transfer buffer size and socket buffer size in FileZilla).
USB Driver
Even at low throughput, I have seen many cases where the USB driver caused issues
resulting in poor throughput. In very high throughput cases, you have
to consider the USB version as well. For example, it would be impossible to achieve
the Cat 6 max throughput (300 Mbps) over USB 2.0; you should use USB 3.0 to
achieve this level of throughput.
Ethernet Cable/Switch
Most of the Ethernet cables and switches you have been using support 10/100BASE
by default, so you would not have many issues with them up to 100 Mbps
throughput. But if you want to achieve throughput much higher than 100 Mbps
(e.g., the Cat 4 max throughput of 150 Mbps), you have to make sure that the
cable is a Cat 6 cable (supporting Gigabit Ethernet) and that all the ports on
the network switch support Gigabit Ethernet as well.
Linux vs Windows
If you are OK with around 90% of the ideal throughput at the IP layer (e.g., around
90 Mbps under Cat 3 max throughput conditions), there may not be any issue with
using either Linux or Windows. But if you want to get very close to the ideal
maximum, I would recommend a Linux PC.
CPU Performance

LTE-level throughput (e.g., 100 Mbps and over) is a pretty tough task not only for
the IP stack, but also for the IP application software and the CPU. So, instead of
jumping directly into the max throughput test, increase the throughput step by step
and check the CPU utilization at each step (e.g., you can monitor CPU utilization
with the Windows Task Manager).
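A simple way to automate this step-by-step ramp is to drive the test tool at increasing rates while sampling CPU load. The sketch below assumes iperf (version 2, UDP mode) and the third-party psutil package are installed; the server address is a placeholder.

```python
# Step the offered load up while watching CPU, instead of jumping
# straight to the max rate. Assumes iperf (v2) and psutil are installed.
import subprocess
import psutil

SERVER = "192.168.0.10"   # placeholder test-server address

for rate in ("10M", "25M", "50M", "100M", "150M"):
    proc = subprocess.Popen(
        ["iperf", "-c", SERVER, "-u", "-b", rate, "-t", "10"],
        stdout=subprocess.DEVNULL)
    psutil.cpu_percent(None)            # reset the sampling window
    proc.wait()
    print(f"{rate}: CPU ~{psutil.cpu_percent(None):.0f}% busy")
```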
Iperf version
This would not make any difference in throughput, but there may be some cases
in which the data does not go through at all in active mode. In that case, try
passive mode.
Lastly, do the testing and analysis as much as possible before the problem is found by somebody else. Normally, when a
problem happens, almost everybody, including me, wants to get it solved right away. But solving a throughput problem
right away is a matter of luck, not a matter of engineering or science, and I don't like any situation that depends
only on luck. The best approach is to analyze the device in as much detail as possible beforehand and see how each of
the factors listed above influences its throughput; each factor influences different device models and software in
different ways. This is the only way to find a solution quickly when the problem happens in the field.