EFFICIENT MOBILE VIDEO TRANSMISSION BASED ON A JOINT CODING SCHEME

In this paper, we propose a joint coding design that uses Symbol Forward Error Correction (S-FEC) at the application layer. The aim of this work is, on the one hand, to minimize the Packet Loss Rate (PLR) and, on the other hand, to maximize the visual quality of video transmitted over a wireless network (WN). The proposed scheme is founded on an FEC that adapts to the semantics of H.264/AVC video encoding. This mechanism relies on a rate-distortion algorithm that controls the channel code rates under the global rate constraints given by the WN. Based on the data partitioning (DP) tool, the proposed optimization mechanism takes both packet type and packet length into account, which leads to unequal error protection (UEP). The performance of the proposed joint source channel coding (JSCC) unequal error control is illustrated by simulations over a wireless network under different channel conditions. The simulation results are then compared with an equal error protection (EEP) scheme.


INTRODUCTION
Nowadays, the broadband digital transmission of video and its applications are a major issue. These communication systems, together with the fast growth of both computing capacity and technical data, are considerably increasing the number of users. Real-time video applications moreover differ from traditional data communications in that they impose strict end-to-end delay constraints. Further, video packets are, for the most part, of different importance. These special requirements typically call for an application source encoder. To this end, many CoDecs (Coder Decoder) have been developed to make the video stream more robust to transmission errors and to compress it more efficiently while reducing computational complexity, in order to satisfy the requirements of such applications.
H.264/AVC is the latest video coding design, and it offers great flexibility in video communications (Schwarz et al., 2007). The design separates a Video Coding Layer (VCL) from a Network Abstraction Layer (NAL) (Yip et al., 2005). The output of the encoding process is VCL data, which is mapped to NAL units (NALUs) prior to transmission. Each NALU forms a packet of a number of bytes comprising a header and a payload. The header defines the type of the NALU and the payload contains the related data. In the VCL, picture frames are divided into macroblocks (MBs). An integer number of MBs is further grouped to form a slice, which can be encoded to fit the size of one or more separate NALUs that can subsequently be decoded independently.
The H.264/AVC coder, running in its extended profile mode, puts the data that makes up a slice into three separate data partitions of different importance in terms of decoded video. Such data partitioning is particularly helpful in many transmission scenarios. It allows unequal error protection (UEP) by applying stronger protection to the more important information. UEP is used in many joint source channel coding (JSCC) techniques and cross-layer strategies.
The most important partition in an H.264/AVC bit stream, called partition A, comprises the slice header and the disparity compensated prediction vectors. This part of the data therefore requires the highest level of error protection. Partition C, in contrast, does not require strong protection, since the information in partitions A and B can be used to conceal the loss of partition C data with the error concealment (EC) tool available in the H.264/AVC standard. EC is in fact very useful for video decoding; for interested readers, an excellent review of the existing error concealment mechanisms is given by Xu and Zhou (2004). When partition A is correctly received, the efficiency of EC depends on the size of the lost B or C partitions, as concealment or recovery is easier for small NALUs. It is more difficult to recover a long partition, as demonstrated in Argyriou et al. (2009). An intra-mode decision algorithm was proposed by Liu et al. (2012) to reduce the computational complexity of intra-frame H.264/AVC encoders. The proposed algorithm achieved an 18% to 70% reduction in computational complexity in comparison with various conventional methods.
Although various error resilience tools such as EC exist in the H.264/AVC coding standard, channel coding methods must also be used to make the transmitted video more resilient to transmission errors. For non-real-time video transmission, automatic-repeat-request (ARQ) systems are best suited, as packet losses can be recovered by repeated retransmissions. For real-time video transmission, however, delay constraints limit the number of retransmissions, so additional error control strategies must be used to ensure reliable transport. Forward Error Correction (FEC) is one of the key protection methods against channel errors. Over the last decade, researchers have devoted much effort to this subject (Piri et al., 2010; Argyriou et al., 2009; Azni et al., 2009; Zhuo et al., 2008; Coudoux et al., 2008). In Zhuo et al. (2008), an adaptive JSCC over a wireless channel is suggested, based on a new rate-quality (R-Q) model of H.264/AVC and the error protection characteristics of the turbo code. Azni et al. (2009) proposed a new method of rate adaptation to the maximum allowed channel transmission rate which does not undertake source rate control in the source encoder. Unequal error protection using the rate compatible punctured convolutional (RCPC) code is then applied at the network abstraction layer to the different partitions. They used joint optimization of FEC at the transmitter and EC at the source decoder. It should be noted, however, that the protection strategies described in these papers are performed at the application layer and do not exploit the mechanisms available in the lower layers of the protocol stack. This is the so-called joint source channel coding (JSCC) scheme. Interested readers can find a review and classification of JSCC techniques in Sayood et al. (2000).
Although considerable literature has been written in this field, the continuing advances in wireless communication systems such as cellular networks make it clear that devising better methods to increase system throughput and to provide better quality of service (QoS) for end users remains challenging (Kuo et al., 2014; Huo et al., 2014; Chandra and Helenprabha, 2014; Vijayan et al., 2015; Pudlewski et al., 2015; Zhou et al., 2015; Weyulu et al., 2016).
For real-time video transmission, the various protection strategies should be assessed according to their effect on the video quality perceived at the receiver side, expressed for instance in terms of peak signal-to-noise ratio (PSNR). In Li et al. (2011), a novel reduced trellis algorithm was developed to measure the end-to-end Rate Distortion (R-D) points of a frame, with a significant reduction in complexity compared with the existing Viterbi-based algorithm. Moreover, Chikkerur (2011) introduced a classification scheme for full-reference and reduced-reference media-layer objective video quality assessment methods. The suggested classification depends on whether natural visual characteristics or perceptual characteristics are considered.
In this paper we propose a novel joint coding scheme whose main reference is the technique proposed in Azni et al. (2009). Our proposal is a JSCC scheme based on Reed-Solomon (RS) forward error correction that is compatible with the H.264/AVC standard. The proposed method relies on a rate-distortion algorithm controlling the channel rates under a global rate constraint given by the network. We used the data partitioning (DP) tool as presented in Azni et al. (2009) and Stockhammer et al. (2004) to allow the use of UEP. We took the packet size into consideration to allocate bandwidth to the different video packets without exceeding the channel capacity. The novelty of our proposal resides in the use of FEC at the application layer based on Shortened RS (S-RS) codes. The S-RS code was applied to the video bit stream by taking the packet priority into account. Simulation experiments over randomly generated noise demonstrated a reduction in Packet Loss Rates (PLRs) compared with the Equal Error Protection (EEP) scheme. Improvements in the average video PSNR of the order of 4 dB were obtained.

SYSTEM OVERVIEW
This section describes the details of our proposal, in which UEP was used to protect the transmitted video sequence. Figure 1 shows the proposed system architecture. The upper part of Figure 1 represents the transmitter side of the system and the lower part shows the receiver side. The digital video content was compressed using the H.264/AVC compression standard and fed to the system input. A simple H.264/AVC NALU packetization scheme was then used, which put precisely one NALU in one Real-time Transport Protocol (RTP) packet. The packetization rules adopted for this mode were as follows: firstly, a NALU was put into the payload of an RTP packet, and secondly, the RTP header values were set as defined in the RTP specification (Schulzrinne et al., 2003). Thereafter, the RTP packet was sent to the lower layers. Also, as shown in Figure 1, H.264/AVC introduced the concept of parameter sets to allow a large number of encoded video sequences to be decoded. Note that the parameter sets were transmitted out of band using the dedicated channel. We considered that sequence parameter sets must be transmitted out of band and free of errors because their integrity is a critical constraint for source decoding. Note also that other non-VCL units, which carry enhancement information not required by source decoding, could be delivered without applying FEC to them. This helped to keep the added redundancy bits to a minimum. The other VCL units were protected by an optimal JSCC scheme in which a variable code rate was calculated for each NALU in a given frame. The code rate selection algorithm used in this work was inspired by the work developed in Azni et al. (2009), which we explain in the next section. We note here that this algorithm used both the type and the size of the NALUs as criteria to calculate the channel coding rate. Afterwards, source packets were sent to the RS encoder to generate additional redundancy blocks that were added to the source code blocks. At the receiver side, encoded NALUs were extracted from the RTP packets and checked for errors. It should be noted that even a single bit error in the transmitted information of a given frame would make source decoding of that frame at the application layer strictly impossible. As shown in Figure 1, in such a case the erroneous packets were passed to the FEC decoder process, which tried to recover them.

JSCC Rate Allocation Algorithm
The first assumption in this work was that the source rate was not controlled dynamically by a source rate-distortion algorithm during the transmission. This allowed a considerable reduction in the processing requirement on the source encoder. The incoming compressed bit stream was first encoded at the application layer, using an H.264/AVC encoder run in its extended profile, which allows data partitioning. Each source frame was thereby encoded as a sequence of 3N NALUs, where N was the number of slices in the frame. The JSCC algorithm (Azni et al., 2009) performs two tasks: it calculates the optimal channel code rate for each partition, and it adjusts the channel bit rate to the capacity of the given link. This algorithm is therefore in charge of optimally allocating the available channel rate between source rate and FEC, according to the importance of the partitions, in order to reduce video distortion at the source decoder output.
Figure 2 illustrates the operation of the rate allocation algorithm. The figure shows an example of a frame consisting of 3 slices that are each encoded into 3 partitions; this results in a total of 9 partitions, each encapsulated in a NALU, as can be seen in Figure 2(i). Figure 2(ii) shows the first processing step of the algorithm. This consists of ordering the source packets according to their subjective importance (type and size) and dividing them into two subsets. The first subset comprised the A partitions, to which the same code rate r_0 was allocated. r_0 was chosen as the smallest code rate permitted by the available bandwidth, to ensure maximum protection of partition A. The second subset was formed by the ordered succession of the B and C partitions, arranged from the largest to the smallest size (descending order). The algorithm then constructed and applied a corresponding code rate vector (r_1, ..., r_2N), starting from some initialization point. Figure 2(iii) shows the result of applying the code rate allocation algorithm to the frame's NALUs. The result is shown as the percentage of channel coding relative to the total bit rate in a given partition. The dark areas correspond to the additional percentage of the bandwidth that was used for channel coding. The percentage of redundancy bits added to partition A is always relatively large compared with the total NALU size, whereas for partitions B and C it depends on the size and type of the partition. The detailed description of the complete JSCC allocation algorithm can be found in Azni et al. (2009).
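The ordering step of Figure 2(ii) can be sketched in a few lines of Python. This is only an illustration of the first processing step, not the allocation algorithm of Azni et al. (2009): the `Nalu` record, the function name and the example sizes are hypothetical, and the sketch assumes that "type and size" means B partitions are placed before C partitions, with descending size inside each type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nalu:
    """Hypothetical record for one data partition carried in a NALU."""
    slice_id: int
    partition: str  # "A", "B" or "C"
    size: int       # payload size in bytes


def order_for_allocation(nalus):
    """Split a frame's NALUs into the two subsets used by the first step.

    Returns (subset_a, subset_bc). All A partitions share one code rate r_0;
    the B and C partitions are ordered by type, then by descending size
    (assumed reading of "type and size"), ready to receive r_1 .. r_2N.
    """
    subset_a = [n for n in nalus if n.partition == "A"]
    subset_bc = sorted(
        (n for n in nalus if n.partition in ("B", "C")),
        key=lambda n: (n.partition, -n.size),
    )
    return subset_a, subset_bc


# Example frame: 3 slices x 3 partitions = 9 NALUs (sizes are made up).
frame = [
    Nalu(0, "A", 120), Nalu(0, "B", 300), Nalu(0, "C", 90),
    Nalu(1, "A", 110), Nalu(1, "B", 150), Nalu(1, "C", 400),
    Nalu(2, "A", 100), Nalu(2, "B", 220), Nalu(2, "C", 60),
]
subset_a, subset_bc = order_for_allocation(frame)
```

If the intended ordering is by size only, regardless of type, the sort key reduces to `-n.size`; the rest of the sketch is unchanged.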

Unequal Error Protection (UEP) at Application Layer
Only one NALU of the types described in Schulzrinne et al. (2003) should be included in the single NALU packet specified here. In other words, neither an aggregation packet nor a fragmentation unit can be used within a single NALU packet. A NALU stream composed by de-encapsulating single NALU packets in RTP sequence number order must conform to the NALU decoding order. The NALU header is designed to serve also as the payload header of the RTP payload format.
We implemented the unequal FEC mechanism with the S-RS code at the application layer. RS codes were chosen because they have good error-correcting properties and are widely used in FEC schemes. The essential concept of the erasure code is described in Hafner et al. (2005). Generally speaking, an (n, m) RS code comprises m source blocks and (n − m) parity blocks. The key idea behind this code is that any subset of m encoded blocks is sufficient to reconstruct the source data when the positions of the lost blocks are known, whereas the error correction capacity of the code is (n − m)/2. In the actual implementation, we used systematic codes, in which the first m of the n encoded blocks are identical to the m source blocks. Each NALU packet was RS encoded using the rate r_i calculated by the channel allocation, where r_i was the channel coding rate applied to the i-th NALU of the picture. Basically, an RS(50/r_i, 50) S-RS code defined over the Galois field GF(256) was used, while an RS(18, 12) S-RS code was used for the RTP header. It follows from the (n − m)/2 correction capacity that any RS block can detect and correct up to (50/r_i − 50)/2 byte errors. All of the original video blocks can be decoded successfully as long as the number of lost blocks does not exceed the correction capacity. Note that blocks that could not be recovered by the channel decoder were simply dropped because they were useless for the video source decoder. Note also that our UEP strategy was designed such that most NALUs corresponding to partition A were recovered. NALUs corresponding to partitions B and C that could not be recovered by the channel decoder would be reconstructed using the H.264/AVC EC tool (Xu et al., 2004).
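To make the relation between the code rate and the protection level concrete, the following Python sketch computes the parameters of an RS(n, m) block from a rate r_i. It is a worked illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, 50 source symbols per block is taken from the RS(50/r_i, 50) notation, and rounding 50/r_i to the nearest integer is an assumption, since the text does not say how non-integer lengths are handled. The limit of 255 symbols is the maximum codeword length of an RS code over GF(256).

```python
K_SOURCE = 50        # source symbols per block, from RS(50/r_i, 50)
MAX_CODEWORD = 255   # longest RS codeword over GF(256): 2**8 - 1 symbols


def codeword_length(rate, k=K_SOURCE):
    """Codeword length n for a target code rate r = k / n.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    if not 0 < rate <= 1:
        raise ValueError("code rate must lie in (0, 1]")
    n = round(k / rate)
    if n > MAX_CODEWORD:
        raise ValueError("codeword too long for GF(256)")
    return n


def rs_capability(n, k):
    """Return (erasures, errors) that a systematic RS(n, k) block tolerates.

    With known loss positions, up to n - k symbols can be recovered;
    with unknown error positions, up to (n - k) // 2 can be corrected.
    """
    if not 0 < k <= n <= MAX_CODEWORD:
        raise ValueError("need 0 < k <= n <= 255")
    parity = n - k
    return parity, parity // 2


# A rate of 0.5 gives RS(100, 50): 50 parity symbols, 25 correctable errors.
n_half = codeword_length(0.5)
erasures_half, errors_half = rs_capability(n_half, K_SOURCE)

# The RS(18, 12) code used for the RTP header corrects up to 3 symbol errors.
header_erasures, header_errors = rs_capability(18, 12)
```

Lower rates therefore buy more parity symbols per block, which is how the allocation gives partition A its stronger protection.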

Simulation Model
A modified version of the encoder test model JM10.1 was used in our work. This version was supplied by the Joint Video Team (JVT). JM10.1 was adopted not because it is the latest available test model but because it supports data partitioning and error concealment. We compared our unequal protection system with a system applying equal error protection (EEP) to all source packets. In the EEP approach, only one channel coding rate was applied to all packets. For both systems, two QCIF sequences, namely Mobile Calendar and Tree, were first encoded, then transmitted, and finally decoded. To obtain pictures encoded in 3 slices each, we fixed the number of MBs per slice to 33, because an encoded picture in a QCIF video sequence includes 99 MBs. In addition, we chose the extended profile of the encoder, which allowed us to encode each slice into 3 NALUs using the data partitioning encoding mode. Each picture was therefore encoded into 9 NALUs.
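The packet count per picture follows directly from the picture format. As a quick check in Python, assuming the standard QCIF resolution of 176×144 luma samples and 16×16 macroblocks (the function name is illustrative):

```python
MB_SIZE = 16                       # H.264/AVC macroblock: 16x16 luma samples
QCIF_WIDTH, QCIF_HEIGHT = 176, 144


def nalus_per_picture(width, height, mbs_per_slice, partitions_per_slice=3):
    """Return (macroblocks, slices, NALUs) for one picture."""
    mbs = (width // MB_SIZE) * (height // MB_SIZE)
    slices = -(-mbs // mbs_per_slice)  # ceiling division
    return mbs, slices, slices * partitions_per_slice


# QCIF: 11 x 9 = 99 MBs, 33 MBs per slice, 3 partitions per slice.
mbs, slices, nalus = nalus_per_picture(QCIF_WIDTH, QCIF_HEIGHT, 33)
```

With these values the sketch yields 99 macroblocks, 3 slices and 9 NALUs per picture, matching the configuration described above.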
The task of the rate allocation process is to maximize the expected PSNR. The reported average PSNR is the arithmetic mean of the PSNR values of all decoded or concealed frames. The reported bit rate is the overall bit rate R_c, which includes both the channel coding and the source bit rate.
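The averaging described above can be sketched as follows. The sketch assumes 8-bit samples (peak value 255) and the usual PSNR definition based on the mean squared error; the text does not state which colour component is measured, and the function names are illustrative.

```python
import math


def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equally sized sequences of sample values."""
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


def average_psnr(reference_frames, decoded_frames):
    """Arithmetic mean of the per-frame PSNR values.

    Concealed frames are scored like any other decoded frame.
    """
    values = [psnr(r, d) for r, d in zip(reference_frames, decoded_frames)]
    return sum(values) / len(values)
```

Note that the mean is taken over per-frame PSNR values rather than computed from a sequence-wide mean squared error; the two generally give different numbers.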
The baseline EEP method used for comparison was designed using the RS code with a fixed average channel coding rate, calculated as given by equation (1), where R_c is the channel rate and R_s(n) is the number of bits in the n-th picture.
Figure 3 shows the results of running the code rate allocation algorithm on the test sequence Mobile Calendar with R_c equal to 180 Kb/frame. The figure shows the average code rate vector obtained for partitions B and C over the entire video sequence, together with the average EEP code rate. It should be noted that the initial rate vector was set to (k/85, k/80, k/75, k/70, k/65, k/60, k/55, k/50) and the symbol size was fixed to 50 bytes. The results in Figure 3 clearly show that the code rate allocation performs efficiently: under UEP, small code rates are assigned to high-priority units, whereas under EEP the channel coding rate remains constant.

 
Figure 4 shows the comparison of variations of PLR resulting from UEP protection.One can note that partition A experiences small PLR even at relatively small values of signal to noise ratio (SNR).This contributes greatly to the efficiency of error concealment algorithms that could be run at the H.264/AVC source decoder.

Simulation Results and Discussion
This section presents the simulation results for the two video transmission techniques considered in this work: UEP and EEP forward error correction coding. We evaluate the performance of the proposed method by measuring the PLR at the output of the channel decoder for both protection techniques. PLR can be considered one of the most important parameters of this study because it quantifies the coding efficiency. The video is compressed using the H.264/AVC encoder in its extended profile, enabling DP. The encoded video stream is channel encoded by a rate compatible punctured RS code according to the UEP or EEP scheme. Noise is added to the encoded sequences. We then analyse the PLR obtained in both cases, before and after channel decoding, in order to compare the effectiveness of the two correctors. In each group of experiments, the received SNR varies from 2.5 dB to 10 dB in steps of 0.5 dB. In both cases, the PLR is improved by channel error correction, although the errors present before channel decoding cannot all be corrected.

Packet Loss Rate (PLR)
PLR has a direct effect on multimedia transmission quality and an indirect effect on data transmissions that use the Transmission Control Protocol (TCP). For this reason, we compared the proposed coding scheme with the EEP coding scheme. It should be noted that a packet is considered lost as soon as a single bit in it is received erroneously. We define the PLR as the ratio of the number of lost packets to the total number of transmitted packets. The variations of the PLR of partitions A, B and C under different channel SNRs, using the EEP and UEP schemes, are presented in Figures 5, 6 and 7. Each result is obtained as the average over 100 runs. In all cases, the PLR decreases as the SNR increases. We can also see that FEC using UEP protects the three types of NAL units unequally. In Figure 5, UEP provides stronger protection of partition A than EEP whatever the value of SNR, unlike the C partitions, which are better protected by EEP than by UEP (Figure 7).
The B partitions are better protected by UEP for SNRs in the range of 0 to 4.5 dB (see Figure 6). Above 4.5 dB, the protection offered by UEP decreases and gives lower quality than EEP. When the channel is very noisy, small NALUs are easier to recover, while the UEP scheme gives better protection to large NALUs. Once the channel becomes less noisy, however, both large and small NALUs can be recovered with the average code rate of the EEP scheme. UEP, on the other hand, loses small packets, which receive less protection under this scheme. This explains why EEP is more effective in the high SNR region, while UEP gives minimal PLR values at low SNR.
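The loss criterion used in this section, in which a packet counts as lost as soon as one bit is in error, can be illustrated with a short Python sketch. It assumes independent bit errors with a given bit error rate, which is a simplification introduced here; the paper's simulations add noise at a given SNR and do not state a bit-error model. All function names are illustrative.

```python
import random


def packet_lost(num_bits, ber, rng):
    """Draw one packet: lost if at least one of its bits is in error."""
    return any(rng.random() < ber for _ in range(num_bits))


def packet_loss_rate(lost_flags):
    """PLR = number of lost packets / number of transmitted packets."""
    if not lost_flags:
        raise ValueError("no packets transmitted")
    return sum(lost_flags) / len(lost_flags)


def expected_loss_probability(num_bits, ber):
    """Closed form under independent bit errors: 1 - (1 - ber)**num_bits."""
    return 1 - (1 - ber) ** num_bits


# Monte Carlo estimate for 400-bit packets at a bit error rate of 1e-3.
rng = random.Random(0)
flags = [packet_lost(400, 1e-3, rng) for _ in range(2000)]
estimated_plr = packet_loss_rate(flags)
```

Under this model the loss probability grows with packet length, which is consistent with the observation that long partitions are harder to deliver intact than short ones before any channel decoding is applied.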

Peak-Signal-to-Noise Ratio (PSNR) and Visual Quality
As a final comparison criterion, we also examine the PSNR of the received video sequences relative to the original sequences in both scenarios (UEP and EEP), in order to visualize the effect of our technique on the reconstructed quality. The comparison results of the two schemes at different channel SNRs are shown in Figures 8 and 9. Typical values for the PSNR in lossy image and video compression vary between 30 and 35 dB, where higher is better. Acceptable values for wireless transmission quality loss are considered to be about 20 to 25 dB. We can clearly notice from Figures 8 and 9 that, compared with the fixed channel coding rate scheme, our proposed coding scheme can achieve a higher reconstructed quality of 32 dB at the receiver side. It can recover most of the A partitions, which makes the concealment of partitions B and C by the source decoder more efficient.

CONCLUSION AND FUTURE WORK
In this paper, we propose a joint source channel coding approach to H.264/AVC video transmission over wireless networks. The proposed technique employs both the data partitioning mode and the error concealment of the H.264/AVC extended profile. Thanks to its efficacy, the second tool is used for most modern video. The proposed scheme is founded on unequal error protection based on Reed-Solomon codes. The UEP scheme allocates the channel code rates jointly, based on the type of the NALUs and their size. By applying strong protection to the packets considered important, our proposal has proven more effective than equal error protection in terms of PSNR and visual quality. The simulation results show that the proposed scheme is efficient even in critical situations where the receiver SNR is low.
For further work, we believe that an interesting issue is to integrate FEC in the Medium Access Control (MAC) layer using RS codes; unrecoverable packets would then be passed to the application FEC decoder process, which would have the task of trying to recover them. It would also be interesting to complement the application of FEC with a low-overhead ARQ.
When an uncorrectable error is detected, a selective ARQ system would request retransmission, but only for the uncorrectable partition A packets, which would also decrease the frequency of retransmissions. This combination has the advantage of offering higher reliability than an FEC system alone and higher throughput than a system with ARQ only. Last but not least, an optimization of the code rate allocation algorithm could make the computations less complex, which would lead to savings in energy consumption.

Figure 3. Average code rate allocated to partitions A, B and C.

Figure 4. Packet loss rate variations for partitions A, B, and C.

Figure 5. PLR of partitions A under EEP and UEP, using test sequence Mobile Calendar.

Figure 6. PLR of partitions B under EEP and UEP, using test sequence Mobile Calendar.

Figure 7. PLR of partitions C under EEP and UEP, using test sequence Mobile Calendar.

Figure 8. Comparison of receiver PSNR as a function of SNR, using video sequence Tree.

Figure 9. Comparison of receiver PSNR as a function of SNR, using video sequence Mobile Calendar.

Figures 10(a) and 10(b) show the visual quality of the decoded frame (No. 96) under the UEP scheme at an SNR of 4.5 dB, for the test sequences Tree and Mobile Calendar respectively. Figures 10(c) and 10(d) show the same decoded frame under the EEP scheme in the same channel conditions. The contrast between the decoded visual qualities under the proposed UEP scheme and EEP is easy to see, which shows that the proposed UEP performs better. It is also interesting to note that 82 frames were dropped because their decoding was impossible when the transmitted video was channel encoded using the EEP scheme, whereas with our UEP scheme the whole video sequence (299 frames) was decoded successfully.


Figure 10. Comparison of the visual quality of decoded frames for the sequences Mobile Calendar (right) and Tree (left), transmitted over an additive white Gaussian noise channel using the channel coding schemes EEP (bottom) and UEP (top).