
*** various workarounds for exchange being braindead

*** this is an *untested* rebase; originally submitted against 1.2

among other things, this contains a possible fix for
https://sourceforge.net/p/isync/bugs/22/ and a lot of related reports.

patch by Florian Lombard <f.lombard@montmirail.com>:

Common cfg section:

  * Either skip or fix messages with lines more than xxx bytes
    (typically no more than 9900 bytes with exchange)
    MaxLineLength xxx (in bytes)
    CutLongLines yes|no (fix or skip message)
  * Allow to rescan all mails from a folder, ignoring the last sync
    latest message pulled (useful when playing with my new settings)
    IgnoreMaxPulledUid yes|no
  * Skip messages with raw binary content (bytes < 0x20 except CR/LF/TAB)
    SkipBinaryContent yes|no
  * Allow to delete non empty folders on slave (when you are sure about
    what you're doing)
    DeleteNonEmpty yes|no

Drivers cfg section (imap only):

  * Suppress Keyword not supported warnings
    IgnoreKeywordWarnings yes|no

The only missing part is cutting long lines when there's CR/LF
conversion (I don't use maildir++)
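
For illustration, the two per-message checks described in the list above
(line length and raw binary content) can be sketched as a single scan.
This is a minimal sketch and not the code from the patch: the names
scan_message and enum scan_result are hypothetical, and CR and LF bytes
are assumed not to count towards the line length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; not part of isync. */
enum scan_result { SCAN_OK, SCAN_LONG_LINE, SCAN_BINARY };

/*
 * Scan a raw message buffer.
 * Returns SCAN_BINARY if any byte below 0x20 other than CR, LF or TAB is
 * found, SCAN_LONG_LINE if some line holds more than max_line_len bytes
 * (CR and LF are not counted), and SCAN_OK otherwise.
 * max_line_len == 0 disables the length check.
 */
static enum scan_result
scan_message( const char *buf, size_t len, size_t max_line_len )
{
    size_t line_len = 0;
    int too_long = 0;

    assert( buf != NULL || len == 0 );
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        if (c < 0x20 && c != '\r' && c != '\n' && c != '\t')
            return SCAN_BINARY;
        if (c == '\n') {
            line_len = 0;  /* a new line starts after LF */
        } else if (c != '\r') {
            line_len++;
            if (max_line_len && line_len > max_line_len)
                too_long = 1;
        }
    }
    return too_long ? SCAN_LONG_LINE : SCAN_OK;
}
```

Binary content wins over a long line here because the scan returns as
soon as it sees an offending byte; that ordering is a choice made for
this sketch only.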

============

my response:

> Common cfg section:
>
>   * Either skip or fix messages with lines more than xxx bytes
>     (typically no more than 9900 bytes with exchange)
>     MaxLineLength xxx (in bytes)
>     CutLongLines yes|no (fix or skip message)
>
as mentioned before, i'm concerned about the "sledge hammer" approach of
hard-cutting the lines, because that falsifies the messages' content,
which may very well render them unreadable (if it's not plain text).
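
to make the concern concrete, here is a minimal sketch of what
hard-cutting amounts to. it is not the code from the patch:
cut_long_lines is a hypothetical name, and only LF line endings are
handled. every inserted LF becomes part of the stored message, which is
exactly the content change discussed above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy in[0..in_len) to out, inserting an LF whenever a line already
 * holds max_len bytes and more line content follows.
 * Returns the number of bytes written. With out == NULL nothing is
 * written, so the same call can be used to size the output buffer first.
 * Simplification: LF-only line endings, no CRLF conversion.
 */
static size_t
cut_long_lines( char *out, const char *in, size_t in_len, size_t max_len )
{
    size_t n = 0, col = 0;

    assert( max_len > 0 );
    for (size_t i = 0; i < in_len; i++) {
        char c = in[i];
        if (c == '\n') {
            col = 0;
        } else if (col == max_len) {
            /* line is full and more content follows: break it here */
            if (out)
                out[n] = '\n';
            n++;
            col = 1;
        } else {
            col++;
        }
        if (out)
            out[n] = c;
        n++;
    }
    return n;
}
```

calling it once with out == NULL and once with a buffer of the returned
size avoids having to estimate the number of added bytes separately.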

meanwhile i found that this should at least not invalidate possibly
present signatures, simply because the respective standards require
complete normalization of the contents before signing - specifically to
avoid the problem.

still, a cleaner approach would be encapsulating the message in a MIME
structure. i found in the imapsync FAQ that "reformime -r7" would do
that (i'm not suggesting to use that, but it should serve as a good
example).

i'd be interested in samples of such messages with excessively long
lines to assess what the "target audience" actually is. i would expect
that messages which already are MIME-encoded would not have this
problem. but then, a sloppily encoded multipart text+html mail could
very well be broken as well.

>   * Allow to rescan all mails from a folder, ignoring the last sync
>     latest message pulled (useful when playing with my new settings)
>     IgnoreMaxPulledUid yes|no
>
that seems to be overkill to me given that it's a workaround and can be
easily achieved by hacking the sync state files, for example by sed'ing
them.
i suppose you implemented this to resume syncing after implementing the
line length workaround?

>   * Skip messages with raw binary content (bytes < 0x20 except CR/LF/TAB)
>     SkipBinaryContent yes|no
>
i know that i suggested that this might be a problem, but i don't
remember whether you reported actual instances of that.
anyway, the treatment should be the same as for messages with excessively
long lines - MIME-encoding (presumably as quoted-printable).
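
as a rough illustration of why quoted-printable would cover both cases:
it escapes control bytes as =XX and splits long lines with soft line
breaks that a decoder removes again, so the content survives the
round-trip. the following is a minimal sketch under simplifying
assumptions (qp_encode is a hypothetical name, LF is the only line
break, and no MIME headers are produced); it is not a proposal for the
actual implementation.

```c
#include <assert.h>
#include <stddef.h>

#define QP_MAX_LINE 76  /* RFC 2045 limit for an encoded line, line break excluded */

/*
 * Quoted-printable encode in[0..in_len) into out; returns bytes written.
 * With out == NULL only the required size is computed.
 * Simplification: LF is treated as the line break and emitted as LF;
 * every other control byte (including a bare CR) is escaped as =XX.
 */
static size_t
qp_encode( char *out, const char *in, size_t in_len )
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0, col = 0;

    assert( in != NULL || in_len == 0 );
    for (size_t i = 0; i < in_len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (c == '\n') {
            if (out)
                out[n] = '\n';
            n++;
            col = 0;
            continue;
        }
        /* space and TAB must be escaped when they end a line */
        int at_eol = (i + 1 == in_len || in[i + 1] == '\n');
        int literal = (c >= 33 && c <= 126 && c != '=') ||
                      ((c == ' ' || c == '\t') && !at_eol);
        size_t width = literal ? 1 : 3;
        /* keep one column free for the '=' of a soft line break */
        if (col + width > QP_MAX_LINE - 1) {
            if (out) {
                out[n] = '=';
                out[n + 1] = '\n';
            }
            n += 2;
            col = 0;
        }
        if (literal) {
            if (out)
                out[n] = (char)c;
        } else if (out) {
            out[n] = '=';
            out[n + 1] = hex[c >> 4];
            out[n + 2] = hex[c & 15];
        }
        n += width;
        col += width;
    }
    return n;
}
```

the sketch breaks lines slightly earlier than strictly necessary (it
never fills the 76th column), which keeps the logic short at the cost of
a few extra soft breaks.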

>   * Allow to delete non empty folders on slave (when you are sure about
>     what you're doing)
>     DeleteNonEmpty yes|no
>
i'll consider this.
my biggest concern is that some transient error would falsify the
mailbox list and thus cause the folders to be nuked. similarly, a
permanent change in the server configuration would have that effect.
arguably, either wouldn't be so bad as such, as it would destroy only
the replica. however, it would be important to verify that the replica
does not contain any unpropagated mails (as opposed to any mails at all,
as is done currently).
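
a minimal sketch of the kind of check meant here, assuming a simplified
model in which each sync record pairs one UID per side and 0 means "no
counterpart". struct pair_rec and replica_fully_propagated are
hypothetical and do not correspond to isync's actual data structures;
unpropagated flag changes are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified sync record: one UID per side, 0 = no counterpart. */
struct pair_rec {
    unsigned far_uid;   /* UID on the authoritative side */
    unsigned near_uid;  /* UID on the replica */
};

/*
 * Returns 1 if every message UID found in the replica is recorded as
 * paired with a far-side message, 0 if at least one is unpropagated.
 * An empty replica is trivially fully propagated.
 */
static int
replica_fully_propagated( const unsigned *replica_uids, size_t n_uids,
                          const struct pair_rec *recs, size_t n_recs )
{
    assert( replica_uids != NULL || n_uids == 0 );
    assert( recs != NULL || n_recs == 0 );
    for (size_t i = 0; i < n_uids; i++) {
        int paired = 0;
        for (size_t j = 0; j < n_recs; j++) {
            if (recs[j].near_uid == replica_uids[i] && recs[j].far_uid != 0) {
                paired = 1;
                break;
            }
        }
        if (!paired)
            return 0;  /* this message exists only in the replica */
    }
    return 1;
}
```

with a check along these lines, deleting a non-empty replica folder
would be refused only when doing so would actually lose a message.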

> Drivers cfg section (imap only):
>
>   * Suppress Keyword not supported warnings
>     IgnoreKeywordWarnings yes|no
>
i wonder why a server would bleat about not supporting an optional
feature when it can (and probably does) announce that in a "civilized"
way, too. did these responses appear to be correlated with specific
messages, or did they always come when opening any mailbox?

> diff --git a/src/drv_imap.c b/src/drv_imap.c
> index e24c7d8..10da0cb 100644
> --- a/src/drv_imap.c
> +++ b/src/drv_imap.c
> @@ -1416,6 +1419,16 @@ imap_socket_read( void *aux )
>                                         resp = RESP_NO;
>                                         if (cmdp->param.failok)
>                                                 goto doresp;
> +                               } else if (!strcmp( "BAD", arg )) {
> +                                       resp = RESP_NO;
> +                               warn( "Warning: IMAP command '%s' returned an error: %s %s\n",
> +                                      starts_with( cmdp->cmd, -1, "LOGIN", 5 ) ?
> +                                          "LOGIN <user> <pass>" :
> +                                          starts_with( cmdp->cmd, -1, "AUTHENTICATE PLAIN", 18 ) ?
> +                                              "AUTHENTICATE PLAIN <authdata>" :
> +                                               cmdp->cmd,
> +                                      arg, cmd ? cmd : "" );
> +                                       goto doresp;
>                                 } else /*if (!strcmp( "BAD", arg ))*/
>                                         resp = RESP_CANCEL;
>
this hunk downgrades tagged BAD responses to warnings and suppresses the
subsequent client-side connection drop.
this doesn't seem like a terribly good idea to me - this server response
indicates that the client (allegedly) did something wrong. that may mean
that the subsequent command stream will be interpreted as garbage, which
may have unpredictable effects. it just isn't safe to continue at this
point.
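
for reference, the distinction the client normally draws between the
three tagged status responses can be sketched as follows. the enum and
function names are hypothetical; the quoted hunk itself uses isync's
RESP_NO and RESP_CANCEL codes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical actions, for illustration only. */
enum action { ACT_CONTINUE, ACT_COMMAND_FAILED, ACT_DROP_CONNECTION };

/*
 * Map the status word of a tagged IMAP response to a client action:
 * OK  - command completed, carry on
 * NO  - command failed, but the session state is still well-defined
 * BAD - protocol-level error; the server may have misparsed the command
 *       stream, so anything else is treated like BAD and the connection
 *       is given up
 */
static enum action
action_for_tagged_response( const char *status )
{
    assert( status != NULL );
    if (!strcmp( status, "OK" ))
        return ACT_CONTINUE;
    if (!strcmp( status, "NO" ))
        return ACT_COMMAND_FAILED;
    return ACT_DROP_CONNECTION;
}
```

the hunk above effectively moves BAD from the last branch into the NO
branch, which is what makes it risky.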
i suppose you implemented this as a workaround before you identified the
line length issue?

============

and a last reply:

>> Common cfg section:
>>
>>    * Either skip or fix messages with lines more than xxx bytes
>>      (typically no more than 9900 bytes with exchange)
>>      MaxLineLength xxx (in bytes)
>>      CutLongLines yes|no (fix or skip message)
> as mentioned before, i'm concerned about the "sledge hammer" approach of
> hard-cutting the lines, because that falsifies the messages' content,
> which may very well render them unreadable (if it's not plain text).

Well you have the choice of just skipping them to allow the sync to
complete if you're concerned about the messages' integrity

> meanwhile i found that this should at least not invalidate possibly
> present signatures, simply because the respective standards require
> complete normalization of the contents before signing - specifically to
> avoid the problem.
>
> still, a cleaner approach would be encapsulating the message in a MIME
> structure. i found in the imapsync FAQ that "reformime -r7" would do
> that (i'm not suggesting to use that, but it should serve as a good
> example).

I had a look at that, and found that completely overkill for my usage
(see below)

> i'd be interested in samples of such messages with excessively long
> lines to assess what the "target audience" actually is. i would expect
> that messages which already are MIME-encoded would not have this
> problem. but then, a sloppily encoded multipart text+html mail could
> very well be broken as well.

100% of those messages had bad html code without line breaks.
Non-binary attachments were always correctly line wrapped.
It was either poorly done html signatures or even javascript (yeah,
inside an email !)
So I wasn't worried about the integrity of those messages, which were
already breaking the rules, but I needed the contents (messages from
customers we needed to keep)

>>    * Allow to rescan all mails from a folder, ignoring the last sync
>>      latest message pulled (useful when playing with my new settings)
>>      IgnoreMaxPulledUid yes|no
> that seems to be overkill to me given that it's a workaround and can be
> easily achieved by hacking the sync state files, for example by sed'ing
> them.
> i suppose you implemented this to resume syncing after implementing the
> line length workaround?

Yes it was mainly a flag I used for debugging (editing hundreds of sync
state files wasn't an option)

>>    * Skip messages with raw binary content (bytes < 0x20 except CR/LF/TAB)
>>      SkipBinaryContent yes|no
> i know that i suggested that this might be a problem, but i don't
> remember whether you reported actual instances of that.
> anyway, the treatment should be the same as for messages with excessively
> long lines - MIME-encoding (presumably as quoted-printable).

Those were bogus messages with the raw attachment in binary but with
base64 headers correctly set.
Nearly 100% (if not 100%) of those were in the sent folder and are
probably the result of gmail + buggy email client (but you can still
open the attachment with gmail !)

>>    * Allow to delete non empty folders on slave (when you are sure about
>>      what you're doing)
>>      DeleteNonEmpty yes|no
> i'll consider this.
> my biggest concern is that some transient error would falsify the
> mailbox list and thus cause the folders to be nuked. similarly, a
> permanent change in the server configuration would have that effect.
> arguably, either wouldn't be so bad as such, as it would destroy only
> the replica. however, it would be important to verify that the replica
> does not contain any unpropagated mails (as opposed to any mails at all,
> as is done currently).

Well, when you are sure about your settings, this can be useful, as my
users were renaming folders while I was working on the sync.
At first I was logging in to the mailbox, deleting the folder, and syncing
again.

>> Drivers cfg section (imap only):
>>
>>    * Suppress Keyword not supported warnings
>>      IgnoreKeywordWarnings yes|no
>>
> i wonder why a server would bleat about not supporting an optional
> feature when it can (and probably does) announce that in a "civilized"
> way, too. did these responses appear to be correlated with specific
> messages, or did they always come when opening any mailbox?

Well, "exchange online", that sums it all ...
Tied to specific messages, I guess it happened when there was a word
between brackets in the message subject (no debug log of that)
Happens only once, when the message is synced.
A rather ugly hack, but I needed clean logs to spot errors.

> i suppose you implemented this as a workaround before you identified the
> line length issue?

I implemented that before the binary content issue
It's exchange which is breaking all the rules that "forced" me to do
that to sync most of the messages
Cutting the connection instead of reporting the right error is not the
right thing to do, but that's what exchange does (with Error 10 or 11,
but with BAD response)

wip/exchange-workarounds-1.4
Oswald Buddenhagen 7 years ago
commit 547de590fe

 src/config.c   |  10
 src/drv_imap.c |  16
 src/sync.c     | 103
 src/sync.h     |   6
10
src/config.c

@ -202,6 +202,16 @@ getopt_helper( conffile_t *cfile, int *cops, channel_conf_t *conf )
conf->sync_state = expand_strdup( cfile->val );
else if (!strcasecmp( "CopyArrivalDate", cfile->cmd ))
conf->use_internal_date = parse_bool( cfile );
+else if (!strcasecmp( "CutLongLines", cfile->cmd ))
+conf->cut_lines = parse_bool( cfile );
+else if (!strcasecmp( "IgnoreMaxPulledUid", cfile->cmd ))
+conf->ignore_max_pulled_uid = parse_bool( cfile );
+else if (!strcasecmp( "SkipBinaryContent", cfile->cmd ))
+conf->skip_binary_content = parse_bool( cfile );
+else if (!strcasecmp( "DeleteNonEmpty", cfile->cmd ))
+conf->delete_nonempty = parse_bool( cfile );
+else if (!strcasecmp( "MaxLineLength", cfile->cmd ))
+conf->max_line_len = parse_int( cfile );
else if (!strcasecmp( "MaxMessages", cfile->cmd ))
conf->max_messages = parse_int( cfile );
else if (!strcasecmp( "ExpireUnread", cfile->cmd ))

16
src/drv_imap.c

@ -78,6 +78,7 @@ typedef union imap_store_conf {
char delimiter;
char use_namespace;
char use_lsub;
+char ignore_keyword_warnings;
};
} imap_store_conf_t;
@ -1627,6 +1628,7 @@ imap_socket_read( void *aux )
error( "IMAP error: bogus greeting response %s\n", arg );
break;
} else if (!strcmp( "NO", arg )) {
+if (!ctx->conf->ignore_keyword_warnings || strcmp( cmd, "Keywords are not supported" ))
warn( "Warning from IMAP server: %s\n", cmd );
} else if (!strcmp( "BAD", arg )) {
error( "Error from IMAP server: %s\n", cmd );
@ -1725,8 +1727,16 @@ imap_socket_read( void *aux )
resp = RESP_NO;
if (cmdp->param.failok)
goto doresp;
-} else /*if (!strcmp( "BAD", arg ))*/
-resp = RESP_CANCEL;
+} else /*if (!strcmp( "BAD", arg ))*/ {
+//resp = RESP_CANCEL;
+// Ignore BAD Error 10 (or 11) when SkipBinaryContent is not used.
+// this doesn't seem like a terribly good idea to me - this server response
+// indicates that the client (allegedly) did something wrong. that may mean
+// that the subsequent command stream will be interpreted as garbage, which
+// may have unpredictable effects. it just isn't safe to continue at this
+// point.
+resp = RESP_NO;
+}
error( "IMAP command '%s' returned an error: %s %s\n",
starts_with( cmdp->cmd, -1, "LOGIN", 5 ) ?
"LOGIN <user> <pass>" :
@ -3671,6 +3681,8 @@ imap_parse_store( conffile_t *cfg, store_conf_t **storep )
store->use_namespace = parse_bool( cfg );
else if (!strcasecmp( "SubscribedOnly", cfg->cmd ))
store->use_lsub = parse_bool( cfg );
+else if (!strcasecmp( "IgnoreKeywordWarnings", cfg->cmd ))
+store->ignore_keyword_warnings = parse_bool( cfg );
else if (!strcasecmp( "Path", cfg->cmd ))
store->path = nfstrdup( cfg->val );
else if (!strcasecmp( "PathDelimiter", cfg->cmd )) {

103
src/sync.c

@ -376,13 +376,15 @@ copy_msg( copy_vars_t *vars )
static void msg_stored( int sts, uint uid, void *aux );
static void
-copy_msg_bytes( char **out_ptr, const char *in_buf, uint *in_idx, uint in_len, int in_cr, int out_cr )
+copy_msg_bytes( char **out_ptr, const char *in_buf, uint *in_idx, uint in_len, int in_cr, int out_cr, uint max_line_len )
{
char *out = *out_ptr;
uint idx = *in_idx;
if (out_cr != in_cr) {
+/* message needs to be converted */
char c;
if (out_cr) {
+/* adding CR */
for (; idx < in_len; idx++) {
if ((c = in_buf[idx]) != '\r') {
if (c == '\n')
@ -391,16 +393,53 @@ copy_msg_bytes( char **out_ptr, const char *in_buf, uint *in_idx, uint in_len, i
}
}
} else {
+/* removing CR */
for (; idx < in_len; idx++) {
if ((c = in_buf[idx]) != '\r')
*out++ = c;
}
}
} else {
+/* no CRLF change */
+if (max_line_len > 0) {
+/* there are too long lines in the message */
+const char *curLine = in_buf + *in_idx;
+int lines = 0;
+while (curLine) {
+char *nextLine = strchr( curLine, '\n' );
+uint curLineLen = nextLine ? (unsigned int)(nextLine-curLine) + 1 : strlen( curLine );
+if (curLineLen > max_line_len) {
+/* this line need to be cut into smaller lines */
+uint line_idx = 0;
+while (line_idx < curLineLen) {
+memcpy( out, in_buf + idx + line_idx, ((curLineLen - line_idx) < max_line_len) ? (curLineLen - line_idx) : max_line_len );
+out += (((curLineLen - line_idx) < max_line_len) ? (curLineLen - line_idx) : max_line_len);
+line_idx += ((curLineLen - line_idx) < max_line_len) ? (curLineLen - line_idx) : max_line_len;
+if( line_idx < curLineLen) {
+/* add (CR)LF except for the last line */
+if (out_cr)
+*out++ = '\r';
+*out++ = '\n';
+}
+}
+idx += curLineLen;
+} else {
+/* simple copy */
+memcpy( out , in_buf + idx, curLineLen );
+out += curLineLen;
+idx += curLineLen;
+}
+curLine = nextLine ? (nextLine+1) : NULL;
+lines++;
+}
+//debug("End index %d (message size %d), message size should be %d\n", idx, in_len, *in_idx + out - *out_ptr);
+} else {
+/* simple copy */
memcpy( out, in_buf + idx, in_len - idx );
out += in_len - idx;
idx = in_len;
+}
}
*out_ptr = out;
*in_idx = idx;
}
@ -411,10 +450,47 @@ copy_msg_convert( int in_cr, int out_cr, copy_vars_t *vars, int t )
char *in_buf = vars->data.data;
uint in_len = vars->data.len;
uint idx = 0, sbreak = 0, ebreak = 0, break2 = UINT_MAX;
-uint lines = 0, hdr_crs = 0, bdy_crs = 0, app_cr = 0, extra = 0;
+uint lines = 0, hdr_crs = 0, bdy_crs = 0, app_cr = 0, extra = 0, extra_bytes = 0;
uint add_subj = 0;
if (vars->srec) {
+/* check if the message has too long lines if enabled */
+if (global_conf.max_line_len) {
+char *curLine = in_buf;
+while (curLine) {
+char *nextLine = strchr( curLine, '\n' );
+uint curLineLen = nextLine ? (unsigned int)(nextLine-curLine) + 1 : strlen( curLine );
+if (curLineLen > global_conf.max_line_len) {
+if (global_conf.cut_lines) {
+/* compute the addded lines as we are going to cut them */
+if (out_cr)
+extra_bytes += curLineLen / global_conf.max_line_len; // CR
+extra_bytes += curLineLen / global_conf.max_line_len; // LF
+} else {
+/* stop here with too long line error */
+warn( "Warning: message %u from %s has too long line(s).\n",
+vars->msg->uid, str_fn[1-t] );
+free( in_buf );
+return 0;
+}
+}
+curLine = nextLine ? (nextLine+1) : NULL;
+}
+}
+if (global_conf.skip_binary_content) {
+while (idx < in_len) {
+unsigned char c = in_buf[idx++];
+if (c < 0x20 && c != '\r' && c != '\n' && c != '\t') {
+/* binary content, skip */
+debug( "Incorrect byte %u at offset %u/%u\n", c, idx, in_len );
+warn( "Warning: message %u from %s has raw binary content.\n",
+vars->msg->uid, str_fn[1-t] );
+free( in_buf );
+return 0;
+}
+}
+idx = 0;
+}
nloop: ;
uint start = idx;
uint line_crs = 0;
@ -494,7 +570,7 @@ copy_msg_convert( int in_cr, int out_cr, copy_vars_t *vars, int t )
extra += add_subj ? strlen(dummy_subj) + app_cr + 1 : strlen(dummy_pfx);
}
-vars->data.len = in_len + extra;
+vars->data.len = in_len + extra + extra_bytes;
if (vars->data.len > INT_MAX) {
warn( "Warning: message %u from %s is too big after conversion; skipping.\n",
vars->msg->uid, str_fn[1-t] );
@ -505,11 +581,12 @@ copy_msg_convert( int in_cr, int out_cr, copy_vars_t *vars, int t )
idx = 0;
if (vars->srec) {
if (break2 < sbreak) {
-copy_msg_bytes( &out_buf, in_buf, &idx, break2, in_cr, out_cr );
+copy_msg_bytes( &out_buf, in_buf, &idx, break2, in_cr, out_cr, 0 );
memcpy( out_buf, dummy_pfx, strlen(dummy_pfx) );
out_buf += strlen(dummy_pfx);
}
-copy_msg_bytes( &out_buf, in_buf, &idx, sbreak, in_cr, out_cr );
+//debug ("Calling copy_msg_bytes for the header (0 to %d) with %d extra bytes\n", sbreak, extra);
+copy_msg_bytes( &out_buf, in_buf, &idx, sbreak, in_cr, out_cr, 0 );
memcpy( out_buf, "X-TUID: ", 8 );
out_buf += 8;
@ -521,7 +598,7 @@ copy_msg_convert( int in_cr, int out_cr, copy_vars_t *vars, int t )
idx = ebreak;
if (break2 != UINT_MAX && break2 >= sbreak) {
-copy_msg_bytes( &out_buf, in_buf, &idx, break2, in_cr, out_cr );
+copy_msg_bytes( &out_buf, in_buf, &idx, break2, in_cr, out_cr, 0 );
if (!add_subj) {
memcpy( out_buf, dummy_pfx, strlen(dummy_pfx) );
out_buf += strlen(dummy_pfx);
@ -534,7 +611,10 @@ copy_msg_convert( int in_cr, int out_cr, copy_vars_t *vars, int t )
}
}
}
-copy_msg_bytes( &out_buf, in_buf, &idx, in_len, in_cr, out_cr );
+//debug ("Calling copy_msg_bytes for the body (at %d) with %d extra byte(s), limit is %d \n", ebreak, extra_bytes, extra_bytes > 0 ? global_conf.max_line_len : 0);
+copy_msg_bytes( &out_buf, in_buf, &idx, in_len, in_cr, out_cr, extra_bytes > 0 ? global_conf.max_line_len : 0 );
+//debug("Message after %s\n", vars->data.data);
+//debug("Good message size should be %d + %d\n",vars->data.len-extra, extra);
if (vars->minimal)
memcpy( out_buf, dummy_msg_buf, dummy_msg_len );
@ -565,7 +645,7 @@ msg_fetched( int sts, void *aux )
tcr = (svars->drv[t]->get_caps( svars->ctx[t] ) / DRV_CRLF) & 1;
if (vars->srec || scr != tcr) {
if (!copy_msg_convert( scr, tcr, vars, t )) {
-vars->cb( SYNC_NOGOOD, 0, vars );
+vars->cb( SYNC_MALFORMED, 0, vars );
return;
}
}
@ -1279,7 +1359,7 @@ box_confirmed2( sync_vars_t *svars, int t )
svars->chan->name, str_fn[t], svars->orig_name[t] );
goto bail;
}
-if (svars->drv[1-t]->confirm_box_empty( svars->ctx[1-t] ) != DRV_OK) {
+if (!global_conf.delete_nonempty && svars->drv[1-t]->confirm_box_empty( svars->ctx[1-t] ) != DRV_OK) {
warn( "Warning: channel %s: %s box %s cannot be opened and %s box %s is not empty.\n",
svars->chan->name, str_fn[t], svars->orig_name[t], str_fn[1-t], svars->orig_name[1-t] );
goto done;
@ -1824,7 +1904,7 @@ box_loaded( int sts, message_t *msgs, int total_msgs, int recent_msgs, void *aux
} else {
if (!(svars->chan->ops[t] & OP_NEW))
continue;
-if (tmsg->uid <= svars->maxuid[1-t]) {
+if (!global_conf.ignore_max_pulled_uid && tmsg->uid <= svars->maxuid[1-t]) {
// The message should be already paired. It's not, so it was:
// - previously paired, but the entry was expired and pruned => ignore
// - attempted, but failed => ignore (the wisdom of this is debatable)
@ -2073,6 +2153,9 @@ msg_copied( int sts, uint uid, copy_vars_t *vars )
ASSIGN_UID( srec, t, uid, "%sed message", str_hl[t] );
}
break;
+case SYNC_MALFORMED:
+assign_uid( svars, vars->srec, t, 0 );
+break;
case SYNC_NOGOOD:
srec->status = S_DEAD;
JLOG( "- %u %u", (srec->uid[F], srec->uid[N]), "%s failed", str_hl[t] );

6
src/sync.h

@ -56,6 +56,11 @@ typedef struct channel_conf {
int max_messages; // For near side only.
signed char expire_unread;
char use_internal_date;
+uint max_line_len;
+char cut_lines;
+char ignore_max_pulled_uid; /* for master only */
+char skip_binary_content; /* for master only */
+char delete_nonempty;
} channel_conf_t;
typedef struct group_conf {
@ -75,6 +80,7 @@ extern const char *str_fn[2], *str_hl[2];
#define SYNC_BAD(fn) (4<<(fn))
#define SYNC_NOGOOD 16 /* internal */
#define SYNC_CANCELED 32 /* internal */
+#define SYNC_MALFORMED 64 /* internal */
#define BOX_POSSIBLE -1
#define BOX_ABSENT 0
