IBM Mainframe Forum

by **steve-myers** » Mon Aug 19, 2013 12:53 am

IBM mainframe Assembler programmers have been using code like this for almost 50 years to convert binary data to hexadecimal digits.

         UNPK  OUTPUT(9),INPUT(5)
         TR    OUTPUT,HEXTAB
         ...
INPUT    DC    X'A0B1C2D3',X'12'
OUTPUT   DC    CL8' ',C' '
HEXTAB   EQU   *-C'0'
         DC    C'0123456789ABCDEF'

Very few of us actually figured this out by ourselves - I'm not one of them - but were shown the method by others or picked it up by reading other programmers' code. It seems almost magical.

Hardly.

UNPK

The first sentence in the description of the UNPK instruction reads

The format of the second operand is changed from packed to zoned, and the result is placed at the first-operand location.

Most of us think we have a pretty good understanding of packed decimal data, and most of us have a vague understanding of zoned decimal data. If we look at the data the second operand of the UNPK instruction points to (INPUT in the example), it is immediately obvious it is not packed decimal data!

Part of what's happening lies in this nearly unnoticeable sentence later on in the description of UNPK. "The sign and digits are not checked for valid codes." Huh!? What is this d****** thing doing!? Also, the UNPK instruction is listed in the section of the manual labeled "General Instructions," rather than the "Decimal Instructions" section. In the original System/360 this was an important distinction. In the lower end machines the decimal instructions were an optional feature, and IBM charged a hefty fee to have them in the machine. The reality is UNPK is not a "decimal" instruction. I think of it as a bit twiddler instruction.

Look at the data carefully. You will notice the input contains 4 bytes followed by a mystery byte, and the output contains 8 bytes followed by a mystery byte, and the UNPK instruction deliberately includes the mystery byte in both operand lengths. Now let's dredge up our understanding of packed decimal data. Most of a packed decimal data area consists of 4-bit digits containing bit codes 0000 (0) through 1001 (9). The last byte contains a 4-bit digit code in its first 4 bits, followed by a 4-bit sign code containing bit codes 1010 through 1111. So it would seem the mystery byte in our binary data is the last byte of a pseudo packed decimal data area. We're going to do something or other with this byte, but it's not really going to be part of our real output.

If we work through the description of UNPK in Principles of Operation - it's pretty hard going - we will find that, except for the last byte, each input byte creates two output bytes. The first 4 bits of each output byte are 1111, and the remaining 4 bits come from the input data. The last input byte just has its two 4-bit halves swapped, which is why we park a throwaway byte there. With our sample data, the first two bytes in the output area contain X'FAF0'.
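For anyone who would rather poke at this off the mainframe, here is a rough Python model of what UNPK does to the bytes. It is only a sketch: the function name `unpk` and its arguments are made up for illustration, and it skips the real instruction's 16-byte operand limit and overlap rules.

```python
def unpk(src: bytes, dest_len: int) -> bytes:
    """Rough model of UNPK: zoned result of length dest_len for source src."""
    # Rightmost source byte: the two 4-bit halves simply trade places.
    last = src[-1]
    out = [((last & 0x0F) << 4) | (last >> 4)]
    # Remaining bytes, processed right to left: each 4-bit half becomes
    # one output byte with 1111 in the first 4 bits.
    for byte in reversed(src[:-1]):
        out.append(0xF0 | (byte & 0x0F))
        out.append(0xF0 | (byte >> 4))
    # Too many digits are dropped on the left; too few are padded with X'F0'.
    out = out[:dest_len]
    out += [0xF0] * (dest_len - len(out))
    return bytes(reversed(out))


# Sample data from the post: X'A0B1C2D3' plus the extra byte X'12'.
result = unpk(bytes.fromhex("A0B1C2D312"), 9)
print(result.hex().upper())  # FAF0FBF1FCF2FDF321
```

The first eight bytes are the ones we care about; the ninth (X'21') is the swapped extra byte that lands in the throwaway position after OUTPUT.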

TR

The first paragraph of the description of the TR instruction reads

The bytes of the first operand are used as eight-bit arguments to reference a list designated by the second-operand address. Each function byte selected from the list replaces the corresponding argument in the first operand.

This is pretty opaque, especially given the way HEXTAB is defined in our example. The third and fourth paragraphs in the description provide more detail, but they are still pretty opaque.

So let's try to make things clearer.

Not stated, but readily deduced: the table referenced by the second operand contains 256 bytes, one for each possible value of an argument byte.
The HEXTAB label in our example program defines what amounts to a virtual start of the table. We can do this because there is no data in the first operand that references relative bytes X'00' through X'EF' in the table.
The first byte in the output area references the byte at HEXTAB+X'FA', which if you do your arithmetic, turns out to be C'A', so the instruction replaces X'FA' with C'A'.
The second byte in the output area is X'F0', so it references the byte at HEXTAB+X'F0', which is C'0', so the instruction replaces X'F0' with C'0'. You can figure out what the remaining bytes become yourself.
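The address arithmetic in the points above can be checked with a small Python model of TR. This is a sketch under stated assumptions: `storage`, `TABLE_START` and `tr` are invented names, the address X'300' is arbitrary, and Python's `cp037` codec stands in for EBCDIC.

```python
TABLE_START = 0x300                 # hypothetical address of the 16-byte DC
storage = bytearray(0x400)          # toy storage image, all zeros
storage[TABLE_START:TABLE_START + 16] = "0123456789ABCDEF".encode("cp037")

# HEXTAB EQU *-C'0': C'0' is X'F0', so the "virtual" table origin sits
# X'F0' bytes in front of the real data.
HEXTAB = TABLE_START - 0xF0


def tr(field: bytes, table_addr: int) -> bytes:
    """Model of TR: each byte is replaced by storage[table_addr + byte]."""
    return bytes(storage[table_addr + b] for b in field)


unpacked = bytes.fromhex("FAF0FBF1FCF2FDF3")   # the 8 bytes UNPK left behind
translated = tr(unpacked, HEXTAB)
print(translated.decode("cp037"))              # A0B1C2D3
```

Only offsets X'F0' through X'FF' from HEXTAB are ever touched, which is why the other 240 bytes of the "table" never need to exist.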

Now that's not so magical, is it? It is hard to understand, but if you are patient you can figure this out pretty easily.

by **BillyBoyo** » Mon Aug 19, 2013 1:19 pm

Thanks very much for these Steve.

Here is an example of this particular technique in use, from IBM system software: http://cd.textfiles.com/pcsig08/801_900 ... GC1013.ALC

Not as nicely-written as yours :-)

I've only really seen the COBOL side in action, and because the technique allows for up to 16 digits to be translated at once, it is much faster than other attempts. Having not got to it myself in COBOL, but taken it from others (with Assembler backgrounds) I never thought deeply enough about it.

Now your explanation gets me to a question.

In my earlier methods to "unhex" in COBOL, I never tried to change the 0-9 digits after whatever translation, as they were already by that time what I wanted (F0, F1, F2 etc), but the FA, FB, FC etc needed "correcting".

Now, from your explanation, in the TR X'F0' is getting to C'0'. But X'F0' is already C'0'.

I'll try the COBOL with only 'ABCDEF' but can't think why it wouldn't work.

Have I missed something?

EDIT: Looked at the COBOL code I'm using: INSPECT ... CONVERTING X'FAFBFCFDFEFF' already :-)

Not my doing, I just copied it...

by **steve-myers** » Mon Aug 19, 2013 5:12 pm

Agreed. The TR instruction is translating C'0' to C'0'. The goal is to translate X'FA' through X'FF' to C'A' through C'F' without altering or otherwise inspecting C'0' through C'9' and X'FA' through X'FF', and doing it quickly.
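To see that the two approaches agree, here is a quick Python check. It is a sketch, not anything from the COBOL or Assembler code: `FULL` and `LETTERS_ONLY` are illustrative names, and the `cp037` codec is assumed to match the mainframe code page for the digits and A through F.

```python
# Full 16-entry mapping, as the TR table does it.
FULL = bytes.maketrans(bytes(range(0xF0, 0x100)),
                       "0123456789ABCDEF".encode("cp037"))

# Only X'FA'..X'FF' mapped, as with INSPECT ... CONVERTING X'FAFBFCFDFEFF'.
LETTERS_ONLY = bytes.maketrans(bytes(range(0xFA, 0x100)),
                               "ABCDEF".encode("cp037"))

unpacked = bytes.fromhex("FAF0FBF1FCF2FDF3")

# X'F0'..X'F9' already are C'0'..C'9', so both tables give the same answer.
same = unpacked.translate(FULL) == unpacked.translate(LETTERS_ONLY)
print(same, unpacked.translate(LETTERS_ONLY).decode("cp037"))  # True A0B1C2D3
```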

My recollection is I had deduced the UNPK piece, which may be the most challenging, on my own, but had to be shown the TR piece. This was in the spring of 1968; I was attempting to figure out this stuff at one end of a snail mail link over around 8000 miles from the east coast of the US to Taiwan.

by **BillyBoyo** » Mon Aug 19, 2013 5:24 pm

So, because the entire "data set" of possible values consists only of those sixteen (0-9, A-F, each preceded by F), the C'0123456789' being present prevents the TR instruction trying to do anything with C'0123456789' other than translating each to itself. Without their presence, each byte would be looked at six times for the X'FAFBFCFDFEFF'?

by **steve-myers** » Mon Aug 19, 2013 6:13 pm

The max input length for the UNPK is 7 data bytes (8 bytes counting the extra byte). 8 data bytes would require an output length of 17 bytes, which is too long for UNPK; each of its operands is limited to 16 bytes.

TXCNVT   CSECT
         USING *-256,12            Base is 256 bytes below TXCNVT
         SAVE  (14,12),,*
         LR    12,15               R15 = entry point address
         LA    0,256
         SR    12,0                R12 = TXCNVT-256, so HEXTAB-C'0'
*                                  (below HEXTAB) stays addressable
         UNPK  OUTPUT(L'OUTPUT+1),INPUT(L'INPUT+1)
         TR    OUTPUT,HEXTAB-C'0'
         LA    0,L'OUTPUT
         LA    1,OUTPUT
         TPUT  (1),(0),R
         RETURN (14,12),T,RC=0
INPUT    DC    X'A0B1C2D3E4F506',X'00'
OUTPUT   DC    CL(2*L'INPUT)' ',X'00'
HEXTAB   DC    C'0123456789ABCDEF'
         END   TXCNVT

Try adding one more byte to INPUT.
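The length arithmetic behind that limit is easy to tabulate. A small Python sketch, with made-up helper names, assuming only that each UNPK operand has a 4-bit length field and so covers 1 to 16 bytes:

```python
UNPK_MAX_LEN = 16   # 4-bit length field per operand: 1..16 bytes


def unpk_lengths(data_bytes):
    """(output length, input length) for UNPK, extra byte included."""
    return 2 * data_bytes + 1, data_bytes + 1


def fits_in_one_unpk(data_bytes):
    out_len, in_len = unpk_lengths(data_bytes)
    return out_len <= UNPK_MAX_LEN and in_len <= UNPK_MAX_LEN


for n in (4, 7, 8):
    print(n, unpk_lengths(n), fits_in_one_unpk(n))
# 4 (9, 5) True
# 7 (15, 8) True
# 8 (17, 9) False
```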

by **BillyBoyo** » Tue Aug 20, 2013 12:38 am

Yes, seven bytes. Using eight bytes, the COBOL compile does two UNPKs. Seven bytes, just one UNPK. I didn't look at the generated code until now...

I may try this with longer fields. As you point out, the UNPK is just manipulating data. As long as I don't do any "calculations" I may be able to get more digits for one TR. One thing at a time though...

by **steve-myers** » Tue Aug 20, 2013 6:18 am

BillyBoyo wrote:... I may try this with longer fields. As you point out, the UNPK is just manipulating data. As long as I don't do any "calculations" I may be able to get more digits for one TR. One thing at a time though...

Yes, 256 max.
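One way the longer-field idea might look, sketched in Python rather than Assembler or COBOL: several 7-byte UNPK steps build the zoned data, then a single translate covers up to 256 bytes of it (TR has an 8-bit length field). The names `unpk_chunk` and `to_hex_ebcdic` are invented for illustration, `cp037` stands in for EBCDIC, and this is not a claim about what the COBOL compiler actually generates.

```python
HEX_EBCDIC = "0123456789ABCDEF".encode("cp037")
TR_TABLE = bytes.maketrans(bytes(range(0xF0, 0x100)), HEX_EBCDIC)
TR_MAX = 256        # one TR handles at most 256 bytes


def unpk_chunk(chunk: bytes) -> bytes:
    """Zoned form of up to 7 data bytes, as UNPK leaves them
    (the swapped extra byte is simply not modelled here)."""
    assert 1 <= len(chunk) <= 7
    out = bytearray()
    for byte in chunk:
        out.append(0xF0 | (byte >> 4))
        out.append(0xF0 | (byte & 0x0F))
    return bytes(out)


def to_hex_ebcdic(data: bytes) -> bytes:
    # One "UNPK" per 7 data bytes.
    zoned = b"".join(unpk_chunk(data[i:i + 7])
                     for i in range(0, len(data), 7))
    # One "TR" per 256 zoned bytes, i.e. per 128 data bytes.
    return b"".join(zoned[i:i + TR_MAX].translate(TR_TABLE)
                    for i in range(0, len(zoned), TR_MAX))


sample = bytes.fromhex("A0B1C2D3E4F506") + bytes.fromhex("0123456789ABCDEF")
print(to_hex_ebcdic(sample).decode("cp037"))
# A0B1C2D3E4F5060123456789ABCDEF
```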

Convert binary to hexadecimal digits
