So, I have some knowledge of this issue and had intended to assist with documentation (and did inform some relevant parties), but have not had the time to sit down and write any KBs on this yet.
For larger Objects (e.g. those that are auto-striped because they exceed the maximum component size of 255GB), there is a fairly significant change in component layout when updating to Object format v13. This layout change is what enables our now significantly lowered guidance on slack-space requirements for Storage Policy reconfigurations, as it changes how Objects are rebuilt during any Storage Policy change that requires a 'deep-reconfig' (i.e. a whole new layout of the Object is created before the previous components are removed, e.g. changing from stripe-width=1 to stripe-width=2, or from RAID1 to RAID5).
But, in order to facilitate these changes, all Objects need to undergo one last reconfiguration that requires the same space as a deep-reconfig. For example, a 1TB vmdk (2TB physically used on disk with FTT=1, FTM=RAID1, assuming it is thick-provisioned for clarity's sake) will require 2TB of free space across viable Fault Domains to perform this reconfiguration; in a 2-node cluster, 3TB free on one node and 500GB free on the other will not suffice.
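To make the arithmetic concrete, here is a quick sketch of the space maths in the example above (the numbers are from that example, not universal):

```shell
# Space maths for the example above: a thick 1TB vmdk at FTT=1/FTM=RAID1
# is mirrored, so it occupies 2x its logical size on disk, and the final
# reconfiguration needs that same amount free (across viable Fault Domains)
# before the old components can be removed.
LOGICAL_GB=1024      # 1TB vmdk
MIRROR_COPIES=2      # RAID1 with FTT=1
USED_GB=$((LOGICAL_GB * MIRROR_COPIES))
echo "Physically used: ${USED_GB}GB; free space needed for the reconfig: ${USED_GB}GB"
```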
This process can be problematic in certain scenarios. A recent one I came across involved an iSCSI target that was consuming more than half of the available storage of a 2-node cluster. Performing a deep-reconfig of such an Object without at least as much free space as it is consuming (and relative to the datastore size) is like asking a semi-truck to turn around in a normal-sized house's driveway.
"This morning, I still have 1 object left to resync, with 568 days left, and climbing. ??"
From the engineering communication I am aware of, if there is not sufficient space (in the applicable Fault Domains) this is intended to time out rather than sit in a never-ending, ever-growing resync, so perhaps this behaviour requires further tooling.
"How can I fix this?"
The Object in question can be fairly easily identified, as it will be the only one on a lower Object format:
# esxcli vsan debug object list --all > /tmp/objout
And then just run the following to see which version the recalcitrant Object is on (it should be the only one that is not v13, and we are not going to guess which version it is):
# grep Version /tmp/objout | sort | uniq
Then either open the /tmp/objout output file with less and search for the Object in question (press '/' and search for 'Version: XX', where XX is whatever version appeared above that is not 13), or grep against it (e.g. grep "Version: XX" -B1 -A200 /tmp/objout) to determine the identity, size, and used size of the Object.
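To illustrate the two commands above end-to-end, here is a self-contained sketch against a fabricated sample file (the UUIDs and the exact field layout are assumptions for illustration, not real esxcli output):

```shell
# Fabricate a tiny sample in roughly the shape of the esxcli output
# (field names and UUIDs are made up for illustration).
cat > /tmp/objout.sample <<'EOF'
Object UUID: aaaaaaaa-1111
   Version: 13
Object UUID: bbbbbbbb-2222
   Version: 13
Object UUID: cccccccc-3333
   Version: 10
EOF

# Step 1: surface the odd Object format version.
grep Version /tmp/objout.sample | sort | uniq

# Step 2: pull the context around the non-v13 Object; -B1 grabs the line
# above (the UUID), and against real output -A200 would also capture the
# size/used-size fields below it.
grep "Version: 10" -B1 /tmp/objout.sample
```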
Once it has been identified (and assuming it is just a space issue), it should be clearer how much space is required to reconfigure this Object, accounting for the fact that we don't resync anything that pushes a disk above 95% used. Freeing that space may be a case of deleting unneeded test VMs/detritus, consolidating snapshots, temporarily moving something off this cluster, or, if possible (e.g. backups exist), temporarily setting this Object or another large Object to FTT=0 (NOTE: THIS IS ONLY APPLICABLE IF IT IS CURRENTLY FTM=RAID1) so that there is enough space to perform this one last true deep-reconfig.
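As a rough sanity check before (or after) freeing space, something like the following captures the 95% constraint mentioned above (the capacity and usage numbers here are invented for illustration):

```shell
# Will the resync fit without pushing a disk past 95% used?
# vSAN won't resync anything that takes a disk beyond that threshold,
# so the effective headroom is 95% of capacity, not 100%.
CAPACITY_GB=4000     # hypothetical disk/fault-domain capacity
USED_GB=1500         # hypothetical current usage
RESYNC_GB=2048       # space the deep-reconfig needs in this example
LIMIT_GB=$((CAPACITY_GB * 95 / 100))
if [ $((USED_GB + RESYNC_GB)) -le "$LIMIT_GB" ]; then
  echo "fits: resync stays under the 95% threshold"
else
  echo "does not fit: free up space first"
fi
```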