Discussion:
LDIF parser disable auto base64 encode
Ab
2018-06-05 13:04:37 UTC
Hello,
I wrote a simple AssemblyLine to read objects from LDAP server A and write them to another LDAP server B. What I did not realize at the time was that if an attribute value has a trailing space, SDI will automatically encode the attribute as base64. This does not matter within SDI, since it encodes and decodes as required every time you operate on the data, but third-party apps that consume the data are not designed that way. My data seems to contain strings, numbers and multivalued attributes. This was supposed to be a one-time thing, so I used the * -> map all attributes option.

Is there an easy way to stop SDI from making the conversion? Is there an easy way to go back and fix the attributes where the conversion has already taken place?

I suppose one way would be for me to explicitly define each attribute and to trim() all the string attributes, so that I don't mess up other attribute types. But I was wondering if there is a better way?
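
The trim-only-the-strings idea above can be sketched as follows. This is a minimal illustration in plain JavaScript, not SDI API code: the helper names `trimTrailing` and `trimEntry` are hypothetical, and a plain object (attribute name -> array of values) stands in for an entry. Inside SDI the values may arrive as Java objects, so the type check would likely need adapting.

```javascript
// Hypothetical helper: strip trailing whitespace from string values only.
function trimTrailing(value) {
    if (typeof value === "string") {
        return value.replace(/\s+$/, "");
    }
    return value; // numbers and other non-string values pass through untouched
}

// attrs is a plain object standing in for an entry: name -> array of values.
// Returns a new object; the input is not modified.
function trimEntry(attrs) {
    var cleaned = {};
    for (var name in attrs) {
        if (Object.prototype.hasOwnProperty.call(attrs, name)) {
            var values = attrs[name];
            var out = [];
            for (var i = 0; i < values.length; i++) {
                out.push(trimTrailing(values[i]));
            }
            cleaned[name] = out;
        }
    }
    return cleaned;
}

// Made-up sample data: strings, a multivalued attribute, and a number.
var sample = {
    cn: ["Jane Doe "],
    mail: ["jane@example.com", "jd@example.com  "],
    employeeNumber: [1042]
};
var result = trimEntry(sample);
console.log(JSON.stringify(result));
```

Because the check is per value rather than per attribute name, this kind of filter would not require listing every attribute explicitly.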
yn2000
2018-06-05 15:13:52 UTC
Hi there,
I wonder if there is some missing info here, because the story does not add up.
- First, the LDIF Parser in the title. What does the LDIF Parser have to do with an LDAP-to-LDAP data transfer? (Note: the LDIF Parser is a function within SDI.)
- Second, SDI encodes the attribute to base64 just because of a trailing space? Really? I don't believe so, because I read a lot of address data with trailing spaces and my address data is not automatically changed to base64.
So, there might be something else causing your situation.
Rgds. YN.
Ab
2018-06-05 15:56:44 UTC
Hi Yn,

Thanks for your reply.
-First: the LDIF Parser documentation (link at the end of this post) says:
"The LDIF Parser correctly parses and writes MIME BASE64 encoded strings: it tries to perform BASE64 encoding if necessary. One such situation is where there are trailing spaces after attribute values: to make sure another LDIF Parser gets the space, it encodes the attribute as BASE64."
-Second: Well, I don't know why this is happening, which is why I am here: to get help from someone who knows better. I can certainly confirm that when an attribute has a trailing space and gets updated to IBM DS, it is actually stored in base64 encoding. If I read it through SDI again I see it as a string, because SDI converts it back. If I read it through, say, an openldap client, I can see the base64-encoded string. If I remove the trailing space before an update operation, I can see that it gets stored as a string. Does it seem weird to me? Yes. Have I ever noticed this happen before? Nope. I only thought to look at the trailing space because of the documentation linked in this post. Is this a "feature" I could do without? Probably. I'm sure there is a good reason for it, though. But something is certainly encoding the strings with trailing spaces, and it's not me doing it explicitly.
My AssemblyLine isn't doing anything complicated: Connector A in Iterator mode (connects to LDAP server A) retrieves data and maps all attributes to work. Connector B in Update mode (connects to LDAP server B) maps all work attributes out. If there is additional information you think might be helpful, let me know.

This might just have to be one of those cases where I cannot write a quick one-time AssemblyLine, but will have to explicitly define attributes and trim the ones that are strings with trailing spaces. Unless someone has a better solution, which will probably help me down the line as well.

https://www.ibm.com/support/knowledgecenter/en/SSCQGF_7.1.0/com.ibm.IBMDI.doc_7.1/referenceguide73.htm

Thanks for taking the time to reply YN.

-Ab.
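
The rule quoted from the LDIF Parser documentation matches the LDIF format itself (RFC 2849), which says a value should be base64-encoded when it is not a "safe string", and that includes values ending in a space. The sketch below illustrates that rule only; it is not SDI's implementation. The function names `needsBase64` and `toLdifLine` are hypothetical, `Buffer` is Node.js-specific, and line folding is left out.

```javascript
// Returns true when an LDIF writer following RFC 2849 would base64-encode value.
function needsBase64(value) {
    if (value.length === 0) {
        return false;
    }
    // A safe string may not start with space, colon, or "<".
    var first = value.charAt(0);
    if (first === " " || first === ":" || first === "<") {
        return true;
    }
    // Values ending in a space should be encoded so the space survives parsing.
    if (value.charAt(value.length - 1) === " ") {
        return true;
    }
    // NUL, LF, CR, and anything outside 7-bit ASCII also force encoding.
    for (var i = 0; i < value.length; i++) {
        var code = value.charCodeAt(i);
        if (code === 0 || code === 10 || code === 13 || code > 127) {
            return true;
        }
    }
    return false;
}

// Formats one attribute line: "name: value" or "name:: base64".
function toLdifLine(name, value) {
    if (needsBase64(value)) {
        return name + ":: " + Buffer.from(value, "utf8").toString("base64");
    }
    return name + ": " + value;
}

console.log(toLdifLine("description", "foo"));  // description: foo
console.log(toLdifLine("description", "foo ")); // description:: Zm9vIA==
```

Any tool that prints search results as LDIF could therefore show a base64 value for a string that is stored with a trailing space, even though the directory holds the plain string. That is one possible reading of what the client displays here, not a confirmed diagnosis.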
yn2000
2018-06-05 18:11:22 UTC
Very interesting...

#1. LDIF Parser topic: Yes, the SDI developers might have used the same logic when building the connector, but mostly a connector's job is to read the data, put it in the 'conn' entry, and then map it to the 'work' entry, and that is it. It is too costly (in processing cycles) to run any parser in between. Plus, my guess is that there are many SDI developers, and the one who developed the LDAP connector is not the same person who developed the File connector, where the LDIF Parser code resides. (Just saying... :-)

#2: "...If I read it through say an openldap client I can see the base64 encoded string..." Is this how you prove it? What if the openldap client is the one that represents the data incorrectly? Have you tried using the openldap client to read the original/source LDAP repository? Have you tried using another LDAP browser or LDAP client?

How about checking the data on the fly, within SDI, by adding a script component called Dump Work Entry before sending the data to the LDAP target?

// Dump the work entry
task.dumpEntry(work);

Rgds. YN.
Ab
2018-06-05 19:46:54 UTC
1. Sure. It describes the exact same behavior that I am seeing, but I see your point and it could certainly be unrelated.

2. Yes, I did read the source directory and it is a string there. The destination goes back to a string as well after I trim all strings before updating it. You may be correct that it is the openldap client; I have not tried others, so that is something for me to check.

3. Within SDI I always see string values. I never see encoded values at any stage. It seems to encode/decode on the fly, similar to how it handles the changes in a changelog connector.

I ended up filtering through the attributes, removing any trailing spaces, and then updating the destination, which seems to have solved my problem.

Thanks for taking the time to help me with this.

-Ab
yn2000
2018-06-05 23:07:35 UTC
I am glad it works for you.
It's SDI... many ways to skin the cat.
Rgds. YN.
Eddie Hartman
2018-06-06 08:18:24 UTC
Just a couple of things: first off, the LDAP Connector uses no Parser. Instead it uses JNDI calls to read objects (entries with attributes) from the directory, or write them to it. The LDIF Parser is used by the Changelog Connectors, since the 'changes' attribute in a Changelog Entry is encoded as incremental LDIF.

As to the Base64 encoding: no, the Connector does not do this. Could it be that the target LDAP you are using does? And could it be that what you are 'seeing' as trailing spaces are actually unprintable characters, and that this causes the encoding?
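
One way to check whether a "trailing space" is really a space is to print the code points at the end of the value. The sketch below is plain JavaScript with a hypothetical helper name, `describeTail`, and made-up sample strings; it is not part of SDI.

```javascript
// Lists the last `count` characters of value as U+XXXX code units,
// so a plain space (U+0020) can be told apart from look-alikes.
function describeTail(value, count) {
    var start = Math.max(0, value.length - count);
    var parts = [];
    for (var i = start; i < value.length; i++) {
        var hex = value.charCodeAt(i).toString(16).toUpperCase();
        while (hex.length < 4) {
            hex = "0" + hex;
        }
        parts.push("U+" + hex);
    }
    return parts.join(" ");
}

console.log(describeTail("Main St ", 3));      // U+0053 U+0074 U+0020
console.log(describeTail("Main St\u00A0", 2)); // U+0074 U+00A0
```

The second sample ends in a no-break space, which looks like a space in most viewers but is a different character.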

Finally, the UserFunction class (the 'system' object) provides encode/decode Base64 functions, in case you need 'em.
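
As an illustration of what such a decode step does, here is a small sketch using Node.js's `Buffer` rather than the SDI `system` object, since the exact SDI method names are not given in this thread. The helper names `decodeBase64` and `encodeBase64` and the sample value are assumptions for the example.

```javascript
// Decode a base64 string to text, assuming the bytes are UTF-8.
function decodeBase64(text) {
    return Buffer.from(text, "base64").toString("utf8");
}

// Encode text as base64, using UTF-8 bytes.
function encodeBase64(text) {
    return Buffer.from(text, "utf8").toString("base64");
}

var decoded = decodeBase64("Zm9vIA==");
// JSON.stringify adds quotes, which makes the trailing space visible.
console.log(JSON.stringify(decoded)); // "foo "
```

Decoding a value copied from a client's output this way shows whether it is just the original string with a trailing space.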

/Eddie
