Friday, January 4, 2013

[Salesforce / Apex] POST Multipart/form-data with HttpRequest

17/10/2014: the solution has been improved. Details are at the end of the post.

Grandma says you cannot post multipart/form-data using an HttpRequest in Apex?
Well, if she says so, now you can tell her this is no longer true!

It all started with a CloudSpokes challenge (here is the link)... When I started the challenge I was absolutely sure I would finish it in less than a day: HTTP GETs and POSTs are not a big problem in Apex... or so it seemed.

To complete the challenge you had to make 4 REST calls (login, book a new upload, upload the file, set permissions): during testing the last step always failed.

This was the first time I had run into this issue.

If you don't want to know what I did, go directly here.
The first thing I noticed was that you cannot send a base64 encoded file to a server expecting a binary file... It wasn't that obvious to me, because I had never struggled with file encoding before.

The first code was something like this:
 
public static HTTPResponse uploadFile(Attachment file)
 {
  String boundary = '__boundary__xxx';
  String header = '--'+boundary+'\n'
     + 'Content-Disposition: form-data; name="data"; filename="'+file.name
     +'"\nContent-Type: application/octet-stream\n\n';

  String footer = '\n--'+boundary+'--';
  
  String body = EncodingUtil.base64Encode(file.Body); //encodes the blob into a base64 encoded String
  
  body = header + body + footer;
  
  HttpRequest req = new HttpRequest();
  req.setHeader('Content-Type','multipart/form-data; boundary='+boundary);
  req.setMethod('POST');
  req.setEndpoint('http://posttestserver.com/post.php?dir=what_a_wonderful_post');   //COOL site to test form uploads
  req.setBody(body);
  req.setTimeout(60000);
  req.setHeader('Content-Length',String.valueof(body.length()));
  
  Http http = new Http();
      return http.send(req);
 }

Then I was all "Eureka! An encoded string cannot be understood if the server needs a binary", so the only thing to do was to concatenate header + file.Body.toString() + footer! This works only if the Blob comes from a text file (e.g. TXT, XML or CSV files): in those cases you don't have any problem... but with binary data all you get is the error:

Blob is not a valid UTF-8 string
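
To see the error in isolation, here is a minimal sketch meant for Execute Anonymous. The four sample bytes (the typical first bytes of a JPEG file) and the variable names are my own choice for illustration, and I am assuming the failure surfaces as a catchable StringException:

```apex
// '/9j/4A==' decodes to the bytes 0xFF 0xD8 0xFF 0xE0, which are not valid UTF-8.
Blob binaryData = EncodingUtil.base64Decode('/9j/4A==');
try {
    // toString() only works when the Blob holds valid UTF-8 text.
    String asText = binaryData.toString();
    System.debug('Unexpectedly converted: ' + asText);
} catch (StringException e) {
    // Assumption: this is where the "not a valid UTF-8 string" message shows up.
    System.debug('Conversion failed: ' + e.getMessage());
}
```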

I had to find another way.


Searching the web for "uploading binary data using apex" I found these discouraging links:
  • http://success.salesforce.com/ideaView?id=08730000000Kr80AAC
  • http://boards.developerforce.com/t5/Apex-Code-Development/Image-upload-using-multipart-form-data/td-p/243335
  • http://boards.developerforce.com/t5/Apex-Code-Development/sending-a-non-ascii-file-via-Http-POST/td-p/116662

That led to the block of comments you can see in the challenge's dashboard.

I didn't give up anyway. I had all the data needed to send the request, so I knew the solution was out there.

The first thing was to understand if there was a way to merge Blobs: it is not possible in Apex unless you have the original data (in that case you use String concatenation, or concatenation of Lists of Integers if you have binary data in the form of integer lists).
So I came up with the idea of merging header, body and footer using their base64 encoded versions, something like this:

String encoded = EncodingUtil.base64Encode(Blob.valueOf(header))+EncodingUtil.base64Encode(file.Body)+EncodingUtil.base64Encode(Blob.valueOf(footer));
 req.setBodyAsBlob(EncodingUtil.base64Decode(encoded));

I found that sometimes it worked (after a while I understood that those times I was just extremely lucky!!).

Debugging and searching the web (see this post for example) I learned that a base64 encoded String can have padding characters, because base64 encoding works on chunks of 3 bytes (see Google for details): if the data length is not a multiple of 3 bytes, this padding is needed.
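
A quick way to see the padding rule, sketched for Execute Anonymous with sample strings I picked for illustration: one leftover byte gives "==", two leftover bytes give "=", and a full 3-byte chunk gives no padding at all.

```apex
// 1 byte: the last chunk is 2 bytes short, so two padding characters are added
System.assertEquals('YQ==', EncodingUtil.base64Encode(Blob.valueOf('a')));
// 2 bytes: the last chunk is 1 byte short, so one padding character is added
System.assertEquals('YWI=', EncodingUtil.base64Encode(Blob.valueOf('ab')));
// 3 bytes: a complete chunk, no padding
System.assertEquals('YWJj', EncodingUtil.base64Encode(Blob.valueOf('abc')));
```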

So I decided to remove the trailing "=" from each encoded chunk of the request body and paste the chunks together. But that is not the proper way to handle base64 encoded strings, as removing the trailing padding requires re-encoding the original data.

The idea was to get rid of all trailing "=" padding in some way, without messing with the encoded strings.

For the header string it was simple, because it is plain text and I could add some blank spaces to get an encoded string without "=". That is:

 
  String boundary = '__boundary__xxx';
  String header = '--'+boundary+'\n'
     + 'Content-Disposition: form-data; name="data"; filename="'+file.name
     +'"\nContent-Type: application/octet-stream';

  String headerEncoded = EncodingUtil.base64Encode(Blob.valueOf(header+'\n\n'));
  //this ensures no trailing "=" padding
  while(headerEncoded.endsWith('='))
  {
   header+=' ';
   headerEncoded = EncodingUtil.base64Encode(Blob.valueOf(header+'\n\n'));
  }

So in practice I add extra spaces before the "\n\n" ending characters until I get an encoded string without padding.

The file Blob is the main problem. To control the trailing padding I would need the unencoded data, that is, a String value of the body; and even with that String, how could I change the file to avoid the "="? As this data can be anything (from text files to zip archives), I cannot simply add some filler characters to avoid the "=" padding, because that would alter the file (not clear, I know)...

If the encoded body doesn't contain any trailing "=", the problem is already solved: the concatenation of the encoded header, body and footer works.
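
Here is a small sketch of that lucky case, with short stand-in strings instead of a real header, file and footer (all values are invented for illustration). The first two parts are exact multiples of 3 bytes, so padding only appears at the very end, where it is harmless:

```apex
// Stand-ins for header and file body: 3 bytes each, so no '=' padding.
String headerEncoded = EncodingUtil.base64Encode(Blob.valueOf('abc')); // 'YWJj'
String bodyEncoded = EncodingUtil.base64Encode(Blob.valueOf('def'));   // 'ZGVm'
// Stand-in for the footer: padding is fine here because it closes the whole string.
String footerEncoded = EncodingUtil.base64Encode(Blob.valueOf('g'));   // 'Zw=='

// Decoding the concatenated base64 yields the concatenated bytes.
Blob merged = EncodingUtil.base64Decode(headerEncoded + bodyEncoded + footerEncoded);
System.assertEquals('abcdefg', merged.toString());
```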

The problem is the last 4 characters of the encoded body. Everything from the start of the encoded body up to those last 4 characters is no problem, because it is an encoded version without trailing "=".

How do I encode those last 4 characters, merging them with the footer?

I discovered that the HttpRequest class has a strange behavior: setBodyAsBlob() and getBody() are complementary for my purposes. That is, the following code doesn't throw a "Blob is not a valid UTF-8 string" exception:

   Blob body = file.body;
   HttpRequest tmp = new HttpRequest();
   tmp.setBodyAsBlob(body);
   String bodyString = tmp.getBody();
   System.debug('## Output body:'+bodyString );

The result is a messy sequence of characters. Are they properly encoded? It seemed so; this is the kind of test I ran:

Blob decoded4Bytes = EncodingUtil.base64Decode('AA==');
System.debug('FIRST ENCODING: '+EncodingUtil.base64Encode(decoded4Bytes));
HttpRequest tmp = new HttpRequest();
tmp.setBodyAsBlob(decoded4Bytes);
System.debug('LAST ENCODING: '+EncodingUtil.base64Encode(tmp.getBodyAsBlob()));

Using different kinds of random encoded data (other than "AA=="), the result of encoding, blobbing and httpRequesting (??!!) is always the same.
This is what I needed:

  1. decode the last 4 characters of the encoded body into a Blob
  2. set it on an HttpRequest using "setBodyAsBlob()"
  3. get the body as a String with "getBody()"
  4. merge this String with the footer
  5. base64 encode the resulting String
  6. concatenate the base64 encodings of the header, the file body (everything except its last 4 encoded characters) and the merged String from step 5
  7. base64 decode the resulting String
  8. here is the Blob you needed!

This is the resulting code:

public static HTTPResponse uploadFile(Attachment file)
{
  String boundary = '__boundary__xxx';
  String header = '--'+boundary+'\n';
  header += 'Content-Disposition: form-data; name="data"; filename="'+file.name
    +'"\nContent-Type: application/octet-stream';

  String footer = '\n--'+boundary+'--';
  
  // no trailing padding on header by adding ' ' before the last "\n\n" characters
  String headerEncoded = EncodingUtil.base64Encode(Blob.valueOf(header+'\n\n'));
  //this ensures no trailing "=" padding
  while(headerEncoded.endsWith('='))
  {
   header+=' ';
   headerEncoded = EncodingUtil.base64Encode(Blob.valueOf(header+'\n\n'));
  }
  //base64 encoded body
  String bodyEncoded = EncodingUtil.base64Encode(file.body);
  //base64 encoded footer
  String footerEncoded = EncodingUtil.base64Encode(Blob.valueOf(footer));
  
  Blob bodyBlob = null;
  //last encoded body bytes
  String last4Bytes = bodyEncoded.substring(bodyEncoded.length()-4,bodyEncoded.length());
  //if the last 4 bytes encoded base64 ends with the padding character (= or ==) then re-encode those bytes with the footer
  //to ensure the padding is added only at the end of the body
  if(last4Bytes.endsWith('='))
  {
   Blob decoded4Bytes = EncodingUtil.base64Decode(last4Bytes);
   HttpRequest tmp = new HttpRequest();
   tmp.setBodyAsBlob(decoded4Bytes);
   String last4BytesFooter = tmp.getBody()+footer;   
   bodyBlob = EncodingUtil.base64Decode(headerEncoded+bodyEncoded.substring(0,bodyEncoded.length()-4)+EncodingUtil.base64Encode(Blob.valueOf(last4BytesFooter)));
  }
  else
  {
   bodyBlob = EncodingUtil.base64Decode(headerEncoded+bodyEncoded+footerEncoded);
  }
  
  if(bodyBlob.size()>3000000)
  { 
   //this is a "public class CustomException extends Exception{}"
   throw new CustomException('File size limit is 3 MBytes');
  }
  
  HttpRequest req = new HttpRequest();
  req.setHeader('Content-Type','multipart/form-data; boundary='+boundary);
  req.setMethod('POST');
  req.setEndpoint('http://posttestserver.com/post.php?dir=watchdox');  
  req.setBodyAsBlob(bodyBlob);
  req.setTimeout(60000);
  req.setHeader('Content-Length',String.valueof(req.getBodyAsBlob().size()));
  Http http = new Http();
  HTTPResponse res = http.send(req);
  return res;
}

I tested it with files of different kinds and sizes and it always worked. I'd like to know your thoughts.
See ya!

UPDATE

See this improvement to my solution. I'll add the content right here:
public static void uploadFile(Blob file_body, String file_name, String reqEndPoint){
      // Repost of code  with fix for file corruption issue
      // Original code postings and explanations
      // http://enreeco.blogspot.in/2013/01/salesforce-apex-post-mutipartform-data.html
      // http://salesforce.stackexchange.com/questions/24108/post-multipart-without-base64-encoding-the-body
      // Additional changes commented GW: that fix issue with occasional corruption of files
      String boundary = '----------------------------741e90d31eff';
      String header = '--'+boundary+'\r\nContent-Disposition: form-data; name="file"; filename="'+file_name+'"\r\nContent-Type: application/octet-stream';
      // GW: Do not prepend footer with \r\n, you'll see why in a moment
      // String footer = '\r\n--'+boundary+'--'; 
      String footer = '--'+boundary+'--';             
      String headerEncoded = EncodingUtil.base64Encode(Blob.valueOf(header+'\r\n\r\n'));
      while(headerEncoded.endsWith('='))
      {
       header+=' ';
       headerEncoded = EncodingUtil.base64Encode(Blob.valueOf(header+'\r\n\r\n'));
      }
      String bodyEncoded = EncodingUtil.base64Encode(file_body);
      // GW: Do not encode footer yet
      // String footerEncoded = EncodingUtil.base64Encode(Blob.valueOf(footer));

      Blob bodyBlob = null;
      String last4Bytes = bodyEncoded.substring(bodyEncoded.length()-4,bodyEncoded.length());

      // GW: Replacing this entire section
      /*
      if(last4Bytes.endsWith('='))
      {
           Blob decoded4Bytes = EncodingUtil.base64Decode(last4Bytes);
           HttpRequest tmp = new HttpRequest();
           tmp.setBodyAsBlob(decoded4Bytes);
           String last4BytesFooter = tmp.getBody()+footer;   
           bodyBlob = EncodingUtil.base64Decode(headerEncoded+bodyEncoded.substring(0,bodyEncoded.length()-4)+EncodingUtil.base64Encode(Blob.valueOf(last4BytesFooter)));
      }
      else
      {
            bodyBlob = EncodingUtil.base64Decode(headerEncoded+bodyEncoded+footerEncoded);
      }
      */
     // GW: replacement section to get rid of padding without corrupting data
     if(last4Bytes.endsWith('==')) {
        // The '==' sequence indicates that the last group contained only one 8 bit byte
        // 8 digit binary representation of CR is 00001101
        // 8 digit binary representation of LF is 00001010
        // Stitch them together and then from the right split them into 6 bit chunks
        // 0000110100001010 becomes 0000 110100 001010
        // Note the first 4 bits 0000 are identical to the padding used to encode the
        // second original 6 bit chunk, this is handy it means we can hard code the response in
        // The decimal values of 110100 001010 are 52 10
        // The base64 mapping values of 52 10 are 0 K
        // See http://en.wikipedia.org/wiki/Base64 for base64 mapping table
        // Therefore, we replace == with 0K
        // Note: if using \n\n instead of \r\n replace == with 'oK'
        last4Bytes = last4Bytes.substring(0,2) + '0K';
        bodyEncoded = bodyEncoded.substring(0,bodyEncoded.length()-4) + last4Bytes;
        // We have appended the \r\n to the Blob, so leave footer as it is.
        String footerEncoded = EncodingUtil.base64Encode(Blob.valueOf(footer));
        bodyBlob = EncodingUtil.base64Decode(headerEncoded+bodyEncoded+footerEncoded);
      } else if(last4Bytes.endsWith('=')) {
        // '=' indicates that encoded data already contained two out of 3x 8 bit bytes
        // We replace final 8 bit byte with a CR e.g. \r
        // 8 digit binary representation of CR is 00001101
        // Ignore the first 2 bits of 00 001101 they have already been used up as padding
        // for the existing data.
        // The Decimal value of 001101 is 13
        // The base64 value of 13 is N
        // Therefore, we replace = with N
        // Note: if using \n instead of \r replace = with 'K'
        last4Bytes = last4Bytes.substring(0,3) + 'N';
        bodyEncoded = bodyEncoded.substring(0,bodyEncoded.length()-4) + last4Bytes;
        // We have appended the CR e.g. \r, still need to prepend the line feed to the footer
        footer = '\n' + footer;
        String footerEncoded = EncodingUtil.base64Encode(Blob.valueOf(footer));
        bodyBlob = EncodingUtil.base64Decode(headerEncoded+bodyEncoded+footerEncoded);              
      } else {
        // Prepend the CR LF to the footer
        footer = '\r\n' + footer;
        String footerEncoded = EncodingUtil.base64Encode(Blob.valueOf(footer));
        bodyBlob = EncodingUtil.base64Decode(headerEncoded+bodyEncoded+footerEncoded);  
      }

      HttpRequest req = new HttpRequest();
      req.setHeader('Content-Type','multipart/form-data; boundary='+boundary);
      req.setMethod('POST');
      req.setEndpoint(reqEndPoint);
      req.setBodyAsBlob(bodyBlob);
      req.setTimeout(120000);

      Http http = new Http();
      HTTPResponse res = http.send(req);
}
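
The padding replacements described in the GW comments can be checked on tiny inputs. This is only a sketch for Execute Anonymous, using one-byte and two-byte sample strings I made up in place of a real file:

```apex
// One leftover byte: 'a' encodes to 'YQ=='.
// Replacing '==' with '0K' appends the bytes \r\n without touching the data.
String oneByte = EncodingUtil.base64Encode(Blob.valueOf('a'));
System.assertEquals('YQ==', oneByte);
Blob oneByteFixed = EncodingUtil.base64Decode(oneByte.substring(0, 2) + '0K');
System.assertEquals('a\r\n', oneByteFixed.toString());

// Two leftover bytes: 'ab' encodes to 'YWI='.
// Replacing '=' with 'N' appends a single \r; the \n then goes in front of the footer.
String twoBytes = EncodingUtil.base64Encode(Blob.valueOf('ab'));
System.assertEquals('YWI=', twoBytes);
Blob twoBytesFixed = EncodingUtil.base64Decode(twoBytes.substring(0, 3) + 'N');
System.assertEquals('ab\r', twoBytesFixed.toString());
```

In both cases the body now ends on a 3-byte boundary, so the encoded footer can be appended safely. To call the method itself you would pass the file Blob, the file name and your endpoint, for example uploadFile(att.Body, att.Name, endpointUrl), where att and endpointUrl are placeholder names; remember that the endpoint must be listed in the org's Remote Site Settings before Apex is allowed to call it.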