[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: ROCm bump to 5.7 - rocm-smi-lib SOVERSIONs bump



I'm reviewing the updates for rocm-smi-lib 5.7.0 in git and figured
out a issue. I'll first briefly write it down here and push the fix
later.

The patch contains an old copy of rsmiBinding.py. Probably due to
python import path reasons.
0001-Merged-python-rsmi-bindings-into-the-main-file.patch

It is outdated. Since the latest version of the rsmiBinding.py
file has been converted into a cmake template, involving placeholders
like @VERSION_MAJOR@. We should no longer include the 0001-* patch,
as its contents are already outdated.

For instance, take a look at the structure rsmi_status_t. The latest
version of the .in file contains status definition for 0x13, but the
oudated patch does not.

My proposed fix is to simply remove the 0001-* patch, and move
the rsmiBinding.py to the standard python library path
/usr/lib/python3/dist-packages/
So the `from rsmiBindings import *` will work.
Working on this now.

Incorporating a moving target into a batch is prone to be oudated.

On Sun, 2023-09-17 at 11:25 +0200, Étienne Mollier wrote:
> Hi all,
> 
> I begun to have a look at the ROCm SMI library[1].  So far I see
> the rocm_smi64 library now has an SOVERSION 5 upstream and the
> oam library is still at SOVERSION 1.  With the changes in build
> system since our rocm-smi-lib is still at ROCm 5.2.3, and
> reviewing the symbols table, disregarding symbols marked
> optional, there is one change I spotted, that is the change of
> signature of the shared_mutex_init function from the following
> in ROCm 5.2.3:
> 
> 	shared_mutex_t shared_mutex_init(const char *name, mode_t
> mode);
> 
> to the following in ROCm 5.7.0:
> 
> 	shared_mutex_t shared_mutex_init(const char *name, mode_t
> mode, bool retried=false);
> 
> If my little understanding of the C++ is correct, the signature
> change with the default value applied won't break the API.  What
> I'm not sure of however, is whether this will break the ABI.  In
> terms of symbols table, the change looks like.
> 
> 	- _Z17shared_mutex_initPKcj@Base 4.5.2
> 	+ _Z17shared_mutex_initPKcjb@Base 5.7.0
> 
> Or through c++filt:
> 
> 	- shared_mutex_init(char const*, unsigned int)@Base 4.5.2
> 	+ shared_mutex_init(char const*, unsigned int, bool)@Base
> 5.7.0
> 
> This symbol change is leaked from the third party shared_mutex
> library, which has probably seen a version bump in the middle.
> It is present in the rocm_smi64 library, but also in the oam
> library which has /not/ seen an SOVERSION bump upstream.
> 
> I'm not certain this symbol is supposed to be exposed by these
> libraries anyway, but I guess it wont hurt to be on the safe
> side.  Is it okay if I align the rocm_smi64 SOVERSION to 5, and
> also bump the oam SOVERSION to, say, 2, to reflect this
> signature change?
> 
> [1]: https://salsa.debian.org/rocm-team/rocm-smi-lib
> 
> Have a nice day,  :)


Reply to: