I have received some errors lately that at first seemed to indicate bad JCL coding; however, the same situation produced different results on different systems. Basically, users are submitting jobs like this:
//MM141KIE JOB (SM00,MM141K,99,99),'LIST',
//STEP2 EXEC PGM=IEBGENER,COND=(0,NE)
//SYSPRINT DD SYSOUT=*
//SYSUT1 DD DISP=SHR,DSN=MM141K.TEST.TAPEMNT,VOL=SER=******
//SYSUT2 DD DSN=&&TEMP1,DISP=(,PASS),UNIT=VTSTAPE,
//SYSIN DD DUMMY
It is a simple JCL, but the SYSUT1 file is a multi-volume dataset; it actually has 7 volumes, and only the first volser was coded in the JCL.
It was not cataloged, so I supposed that either all 15 volumes should be coded in the JCL or the job would fail, or that CA-1 would use its volume chain information to mount all of them.
That is what seemed to happen: the job started to run, and at that time I thought it was because CA-1, having all the volume chain information, was able to mount all the volumes.
However, only the first 6 tapes were mounted, and the job ended with ABEND code S637 and the following message:
IEC026I 637-54,IFG0554J,MM141KIE,STEP2,SYSUT1,1E59,, 869
At first I thought it was related to exit CBRUXVNL, so I disabled it and reran a test job reproducing the same situation, but got the same result.
How, and what, allows the system to mount the first 6 volumes but not the others? Were the 6 volumes mounted because of information in the TMC? If yes, why didn't it continue to mount?
I need to know which system component determines the different results.
Note: I don't know if it might be the reason, but I ran the same job on another system with CA-1 v14 and it ran fine, mounting all volumes without any problem. Both have TS7700 tape subsystems.
Correcting the following phrase: I supposed that the 15 volumes should be coded in JCL or job would fail
I meant: I supposed that the 7 volumes should be coded in JCL or job would fail
Let me give you the 50 cent tour of IBM tape allocation and where CA-1 fits in.
1). All allocation of tape volumes is done via the z/OS operating system. This is provided via SMS/OAM and JCL. CA-1 is not part of the actual allocation of tape volumes. Our first real 'hook' into the operating system is after the tape has been mounted and the reading of the IBM standard tape labels.
2). The operating system needs to know the attributes of the DSN you are trying to read in. Since this DSN was not cataloged in the OS/Catalog, we need to provide the information via the JCL. Had the DSN been cataloged, you would just need to provide the DSN and the DISP parameter in the JCL.
3). Reading in the 15 volume series when the DSN is not cataloged:
//SYSUT1 DD DISP=SHR,DSN=MM141K.TEST.TAPEMNT,
4). Reading in a 2nd file on a tape volume. Since the information is not in the OS/Catalog, we must provide it in the JCL.
5). Reading in a DSN that is in the OS/Catalog:
//SYSUT1 DD DISP=SHR,DSN=MM141K.TEST.TAPEMNT
A good habit to get into is to always catalog your DSN when possible. IBM has made some changes in the way they handle an output volume series with more than 5 volumes. Before z/OS 2.1, you had to code a VOL=(,,99) in the JCL if the job created more than 5 volumes. You would see an ABEND 837-08 on the output DD. With z/OS 2.1 and above, you no longer need to add the VOL parameter. The issue was the IBM JFCB control block only had room for the first 5 volumes.
Give the JCL a test run and let me know if you have any more questions on this.
Robert, thank you for your reply. I understand what you say and agree with your statement.
What you mentioned above in topics 1, 2, and 3 was what I meant in my original post; that's why I named this discussion "Incomplete VOL=SER statement on JCL" and mentioned that the JCL would be incomplete, but I appreciate your reminder here. Remembering that the JCL above is just an example of what our user coded, the point here is how the system is able to mount 6 volumes even though the DSN is not cataloged and [VOL=SER=(000001,000002,000003,000004,000005,000006)] was not coded, considering that CA-1 has nothing to do with it during the mount.
Regarding z/OS version, both systems are at same level, z/OS 2.1.
Check this out:
This is a real test case on a test system. DSN= AB969F.TESTE.TAPE
This is the list of volumes currently allocated to it:
1STVOL = X25574
This is the JCL to copy it. Note that only first volume was coded:
//COPYDSN JOB (SM00),'COPY_DATASET',MSGCLASS=M,NOTIFY=MM141K,
//STEP1 EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN DD DUMMY
//SYSUT1 DD DSN=AB969F.TESTE.TAPE,UNIT=VTSTAPE,
//SYSUT2 DD DSN=&&TEMP1,DISP=(,PASS),UNIT=VTSTAPE
TAPE MOUNTED ON 1914,X25574,COPYDSN ,STEP1
TAPE MOUNTED ON 1914,X09651,COPYDSN ,STEP1
TAPE MOUNTED ON 1914,X25736,COPYDSN ,STEP1
TAPE MOUNTED ON 1914,X09847,COPYDSN ,STEP1
TAPE MOUNTED ON 1914,X25966,COPYDSN ,STEP1
TAPE MOUNTED ON 1914,X16553,COPYDSN ,STEP1
TAPE MOUNTED ON 1914,X23374,COPYDSN ,STEP1
IEC205I SYSUT2,COPYDSN,STEP1,FILESEQ=1, COMPLETE VOLUME LIST,
IEF234E K 1914,X23374,PVT,COPYDSN,STEP1
Job mounted all 7 volumes and completed with RC=0000
*This was a test that ran on the first system, at z/OS 2.1.
Now the same JCL and the same situation on the production system, at z/OS 2.1:
//SYSUT1 DD DISP=SHR,DSN=AB969F.TESTE.TAPE,VOL=SER=2M4531
TAPE MOUNTED ON 1E40,2M4531,MM141KIE,STEP2
TAPE MOUNTED ON 1E40,0O5126,MM141KIE,STEP2
TAPE MOUNTED ON 1E40,2M4771,MM141KIE,STEP2
TAPE MOUNTED ON 1E40,1O8786,MM141KIE,STEP2
TAPE MOUNTED ON 1E40,1O9112,MM141KIE,STEP2
TAPE MOUNTED ON 1E40,2M5379,MM141KIE,STEP2
Then, instead of mounting the 7th volume, the following message is issued and the job ends with 637-54.
For some reason it looks like the volser is not being picked up correctly. According to message IEC501A, it requests a mount for volser ' 1O'.
So basically, I want to understand this first: assuming we are correct that all volumes must be coded in the JCL or the DSN must be cataloged, how did the first test pass with RC=0000 and do the copy without any issue? And the other question is, what may be different between the two systems?
Something doesn't look right on this. Do me a favor and send me your Site ID to email@example.com and I'll open an issue for you.
Will do this BOB.
Just something important that I didn't mention: messages IEC710I and IEC716I
Except for the 1st volume, which is coded in the JCL, those two messages are issued before every tape mount, indicating that the volume list has been corrected.
*I think it might be an exit point that is intercepting it.
There have been some changes with regard to the way modern z/OS works and CA 1 interacts with it. Starting a couple of releases ago (z/OS 1.13 or 2.1 I believe), IBM added the ability for the Label Anomaly exit to supply the next volume when an EOV (end-of-volume) is reached prior to an EOF (end-of-file). CA 1 added support for this new Label Anomaly exit call with RO63948. What this means is that if you supply only 1 volume of a multi-volume file, when the first EOV is reached the CA 1 supplied Label Anomaly exit will read the NXTVOL field and supply it to IBM. That is what you are seeing with the IEC716I message "multivolume list corrected". At this same time, IBM also eliminated the need for the VOL-COUNT parameter during the creation of a multi-volume file (so no more S837 type abends because a high volume-count file is being created).
Anyway, in the case of the S637 abend you are getting, I would want to see what the NXT-VOL field is set to in the last volume successfully mounted by the abending job. It sounds like there might be some type of chaining error within the TMC multi-volume chain. If the list of volumes is NOT supplied via JCL (or MVS/Catalog), then it will be supplied by CA 1 one volume at a time. If there is a chaining error in the CA 1 multi-volume chain, that will cause an abend and is most likely the cause of your problem.
CA 1 Architect
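To make the one-volume-at-a-time hand-off described above concrete, here is a minimal Python sketch of following NXTVOL pointers through a chain and stopping on a chaining error. It is only an illustration, not CA 1 code: the `walk_chain` function and the dictionary standing in for TMC records are hypothetical, the volsers are the ones quoted later in this thread, and treating 652456 as the last volume of the chain is an assumption.

```python
from typing import Dict, List, Optional

def walk_chain(nxtvol: Dict[str, Optional[str]], first: str) -> List[str]:
    """Follow NXTVOL pointers from `first`; raise LookupError on a chaining error."""
    chain = [first]
    current = first
    while nxtvol[current] is not None:
        nxt = nxtvol[current]
        if nxt not in nxtvol:
            # The pointer names a volume that has no record at all.
            raise LookupError(f"volume {nxt!r} does not exist (NXTVOL of {current})")
        if nxt in chain:
            # The pointer leads back to a volume already visited.
            raise LookupError(f"chain loops back to {nxt}")
        chain.append(nxt)
        current = nxt
    return chain

# Hypothetical stand-in for TMC records: volser -> NXTVOL (None = end of chain).
GOOD = {
    "649210": "650550", "650550": "650642", "650642": "650774",
    "650774": "650948", "650948": "604927", "604927": "652456",
    "652456": None,  # assumption: 652456 is the last volume
}

# Same chain, but with a deliberately damaged pointer on the 6th volume.
BROKEN = dict(GOOD, **{"604927": "    65"})
```

Calling `walk_chain(GOOD, "649210")` returns all seven volsers in mount order, while `walk_chain(BROKEN, "649210")` raises `LookupError` naming 604927 as the volume with the bad pointer, which is the record a TMSBINQ inquiry would then be run against.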
Russel, thank you for your contribution here.
Note that it is trying to get the next vol, however it seems to get an offset issue:
VOLUME ' 1O' does not exist. So that's why it is taking the abend anyway
That is why I said you need to validate that the PREVIOUS volume has a good NXTVOL field in it. For example, a TMSBINQ of the last successfully mounted volume would show the NXTVOL field. Is "10" the last 2 digits of the NXTVOL? Or the first 2 digits of the NXTVOL? Or was NXTVOL manually updated and only contains a "10"? That would be seen by looking at the TMC record of the last successfully mounted volume. And TMSBINQ will display that quite nicely.
OK, since this test case was executed last week and I had used a RETPD of 1 day, it was already scratched, so I created a new one. Here is the allocation for the 6th volume:
IECSWBT TAPE MOUNTED ON 13B9,604927,MM141KIE,STEP2
Here is the TMC list for last volume mounted:
1STVOL = 649210 NEXTVOL= 652456 PREVVOL= 650948 PRERRC = 00000
See that it seems to be an offset issue: it takes ' 65' when it should be 652456.
Please open an issue with CA 1 support and supply them with a TMSBINQ report of volume 2M5379 so that we can validate what it has for its NXTVOL field.
Okay, I REALLY think you need to contact support and open an issue with CA 1 support. This appears to be a different volume chain than the previous one you mentioned. So, the problem has now happened with 2 different multi-volume chains.
In the first example, you indicated it was the 7th volume that failed. But in a subsequent test you read a 7-volume chain just fine. What volseq was 652456? Because you are only supplying small little snippets of documentation here, this is proving impossible to completely diagnose. That is why it would be so much better to open a support ticket and send in complete documentation.
Russel, again, I really appreciate your inputs here; they have been very helpful indeed, but I just opened a discussion topic for a simple discussion with other technicians. I don't have anything drastically impacting my systems that would justify opening a case for now. I initially opened this topic because I thought it would be a productive discussion in this community and that I could get an answer to my concern at the same time.
Answering your last question, there are different chains because the same test is being done on different systems (TEST / PROD), but all of them were labeled correctly here to avoid misunderstanding. I described what is test, what is production, and what I re-created in order to provide the TMSBINQ example you requested. So if the only way to talk about it is to open a case, that's OK with me and I can do it, but it was not my preferred option for collaborative work.
Sorry if I am not being clear in my statements; I am trying to be as clear as I can.
Here is the new chain I created to show you the whole thing:
DSN = MM141K.TEST.VOLPARM
IECSWBT TAPE MOUNTED ON 13B9,649210,MM141KIE,STEP2
IECSWBT TAPE MOUNTED ON 13B9,650550,MM141KIE,STEP2
IECSWBT TAPE MOUNTED ON 13B9,650642,MM141KIE,STEP2
IECSWBT TAPE MOUNTED ON 13B9,650774,MM141KIE,STEP2
IECSWBT TAPE MOUNTED ON 13B9,650948,MM141KIE,STEP2
IECSWBT TAPE MOUNTED ON 13B9,604927,MM141KIE,STEP2
Here is TMC info for last volume mounted
Okay, I understand a little better now. To be honest, I was looking at the code now and see we simply do a GET-VOLUME on the previously mounted volume (obtained from the IBM control blocks passed to the Label-Anomaly exit), and if the NXTVOL is not all blanks or hex-zeros we simply move the NXTVOL into the Label-Anomaly exit parameter list. We do not look at the volume-sequence and do different processing based on any volume-sequence. It is just a simple 6-character move. Now, the 7th volume failure might be something within the operating system (a bug in the Label-Anomaly exit processing) and might be dependent on the release and maintenance level. In your tests, it has failed twice on the 7th volume and also been successful with a 7-volume set. Did all 3 tests run on the same LPAR? If they ran on different LPARs, are they at the same maintenance level?
As far as the basic support for the IEC710I and IEC716I and changes to the Label-Anomaly exit (and the elimination of the need for a Volume-Count to be specified when creating the file) I would recommend you review the changes IBM documents for tape processing. The elimination of the S837 and the volume-count specification is long overdue and a very nice enhancement.
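The NXTVOL check and 6-character move described above, together with the "offset" symptom reported earlier in the thread, can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual exit code: `next_volume`, `read_field`, the `RECORD` layout, and the 4-byte shift are all hypothetical, chosen only to show how a 6-byte read that starts too early would yield a blank-padded value beginning with 65.

```python
from typing import Optional

BLANKS = b" " * 6      # ASCII blanks for illustration; a real record would be EBCDIC
HEX_ZEROS = bytes(6)   # six x'00' bytes

def next_volume(nxtvol_field: bytes) -> Optional[bytes]:
    """Return the 6-byte NXTVOL to hand over, or None if the field is empty."""
    if len(nxtvol_field) != 6:
        raise ValueError("NXTVOL field must be exactly 6 bytes")
    if nxtvol_field in (BLANKS, HEX_ZEROS):
        return None
    return nxtvol_field  # the "simple 6-character move"

def read_field(record: bytes, offset: int, length: int = 6) -> bytes:
    """Slice a fixed-length field out of a record buffer."""
    return record[offset:offset + length]

# Hypothetical layout: 4 filler blanks, then NXTVOL at offset 4.
RECORD = b"    652456"
NXTVOL_OFFSET = 4

correct = read_field(RECORD, NXTVOL_OFFSET)   # b"652456"
shifted = read_field(RECORD, 0)               # b"    65", read 4 bytes too early
```

Note that the shifted value is neither all blanks nor hex-zeros, so `next_volume(shifted)` would pass it along as if it were a valid volser, which matches the kind of mount request for a nonexistent volume seen in the failing jobs.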
Russel, I did the same test on 3 different systems: 1 production and 2 different test systems.
All of them have z/OS 2.1, and the only system where the test passed was the one with CA-1 14.0. It might be just a coincidence, but it's the only difference I am aware of.
CA-1 14.0 was installed on this system 1 week ago, and there was no exit customization or change to the exits during installation. Unfortunately I had not tested it before the new release was installed, so I can't confirm whether it was working (or not working) like the others when we were at 12.6.
Just giving a new status here. It seems to be a thing related to CA-1 v12.6.
As reported above, I have run the same test on 3 different systems (1 production and 2 different test systems) at the same z/OS level but with different CA-1 versions. My Prod system has CA-1 12.6 with PTF RO63948 (which should fix these label anomalies) applied. My test systems have different versions: system A has CA-1 v14.0 and system B has CA-1 12.6. As presented above, the only system that completed the whole volume chain and resolved the label anomalies was the one with CA-1 v14.0.
In order to test another situation, I went to a 4th system which has RMM instead of CA-1. I did the same test, now with 10 chained volumes, and it completed as expected:
IECSWBT TAPE MOUNTED ON 1405,SC3993,COPYDSN ,STEP1
IEC710I 1405,SC3993,COPYDSN,STEP1,SYSUT1 ANOTHER VOLUME EXPECTED
IEC716I SYSUT1 :TAPE MULTIVOLUME LIST CORRECTED
IEC501A M 1405,SC8739,SL,COMP,COPYDSN,STEP1,MM141K.TEST.VOLPARM2
IECSWBT TAPE MOUNTED ON 1405,SC8739,COPYDSN ,STEP1
IEC501A M 1405,SE3325,SL,COMP,COPYDSN,STEP1,MM141K.TEST.VOLPARM2
IECSWBT TAPE MOUNTED ON 1405,SE3325,COPYDSN ,STEP1
IEC501A M 1405,SC1594,SL,COMP,COPYDSN,STEP1,MM141K.TEST.VOLPARM2
IECSWBT TAPE MOUNTED ON 1405,SC1594,COPYDSN ,STEP1
IEC501A M 1405,SE8349,SL,COMP,COPYDSN,STEP1,MM141K.TEST.VOLPARM2
IECSWBT TAPE MOUNTED ON 1405,SE8349,COPYDSN ,STEP1
IEC501A M 1405,SC5499,SL,COMP,COPYDSN,STEP1,MM141K.TEST.VOLPARM2
IECSWBT TAPE MOUNTED ON 1405,SC5499,COPYDSN ,STEP1
TAPE MOUNTED ON 1405,SC2958,COPYDSN ,STEP1
IEC501A M 1405,SE2844,SL,COMP,COPYDSN,STEP1,MM141K.TEST.VOLPARM2
IECSWBT TAPE MOUNTED ON 1405,SE2844,COPYDSN ,STEP1
IEC501A M 1405,SE3062,SL,COMP,COPYDSN,STEP1,MM141K.TEST.VOLPARM2
IECSWBT TAPE MOUNTED ON 1405,SE3062,COPYDSN ,STEP1
IEC501A M 1405,SB1977,SL,COMP,COPYDSN,STEP1,MM141K.TEST.VOLPARM2
IECSWBT TAPE MOUNTED ON 1405,SB1977,COPYDSN ,STEP1
$HASP901 COPYDSN MAXCC(0000) INFUSR=NO
That indicates an inconsistency in CA-1 12.6, even with RO63948 applied. I will open a case at support.ca.com in order to possibly have it addressed.
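For anyone comparing job logs across systems as was done in this thread, a small Python sketch like the one below can pull out the volsers that were requested (IEC501A) and the ones actually mounted (TAPE MOUNTED ON). The regular expressions are based only on the message layouts quoted above and are an assumption about the general format; `mount_summary` and `SAMPLE_LOG` are hypothetical names, and the sample lines are copied from the RMM test log.

```python
import re
from typing import List, Tuple

# Device address is 4 hex digits; the volser is the 6 characters before the next comma.
MOUNTED = re.compile(r"TAPE MOUNTED ON ([0-9A-F]{4}),([^,]{6}),")
REQUESTED = re.compile(r"IEC501A M ([0-9A-F]{4}),([^,]{6}),")

def mount_summary(log: str) -> Tuple[List[str], List[str]]:
    """Return (mounted volsers, requested volsers) in log order."""
    mounted: List[str] = []
    requested: List[str] = []
    for line in log.splitlines():
        hit = MOUNTED.search(line)
        if hit:
            mounted.append(hit.group(2))
            continue
        hit = REQUESTED.search(line)
        if hit:
            requested.append(hit.group(2))
    return mounted, requested

SAMPLE_LOG = """\
IECSWBT TAPE MOUNTED ON 1405,SC3993,COPYDSN ,STEP1
IEC710I 1405,SC3993,COPYDSN,STEP1,SYSUT1 ANOTHER VOLUME EXPECTED
IEC716I SYSUT1 :TAPE MULTIVOLUME LIST CORRECTED
IEC501A M 1405,SC8739,SL,COMP,COPYDSN,STEP1,MM141K.TEST.VOLPARM2
IECSWBT TAPE MOUNTED ON 1405,SC8739,COPYDSN ,STEP1
TAPE MOUNTED ON 1405,SC2958,COPYDSN ,STEP1
"""

mounted, requested = mount_summary(SAMPLE_LOG)
# Requests that never got a matching mount message point at the failing volume.
unmounted = [vol for vol in requested if vol not in mounted]
```

On a failing run, the last entry in `requested` that is missing from `mounted` is the volser the system asked for but could not find, and the last entry in `mounted` is the volume whose TMC record is worth inspecting.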