assertion
Full identifier: https://w3id.org/np/RAPwHYQQtXh6p3DQQ066TmpKOBMIWkerAYv-chCViAqC0#assertion
Minted in Nanopublication
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
http://purl.org/dc/terms/creator
creator
https://w3id.org/np/RAoSadUw99...
Leshem Choshen 🤖🤗 @ICML wanna talk?
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
http://purl.org/spar/cito/discusses
discusses
https://arxiv.org/abs/2311.13171
2311.13171
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
http://purl.org/spar/cito/discusses
discusses
https://www.alphaxiv.org/pdf/2408.03092
2408.03092
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
http://purl.org/spar/cito/discusses
discusses
https://x.com/prateeky2806/status/1727589818618523783
status/1727589818618523783
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
http://purl.org/spar/cito/includesQuotationFrom
includesQuotationFrom
https://x.com/prateeky2806/status/1727589818618523783
status/1727589818618523783
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
http://www.w3.org/2000/01/rdf-schema#comment
comment
(this is a literal)
" Merging models trained for long with WIDEN
When models were trained on a lot of data they diverged further from the baseline (e.g. in continual pretraining for additional languages), current merging methods underperform in this setting
https://alphaxiv.org/pdf/2408.03092
@AlibabaGroup https://twitter.com/LChoshen/status/1823002789217493392/photo/1
How do you do that?
Let's assume we update a matrix with a few models.
Pick a pretrained model and consider the rest of the models as diff from it (task vectors)
Normalize the row of each model, separating the normalization factor (magnitude) and direction (row)
Now we weigh every row by how much it changed (higher = better) and average all together
+ some trick to sometimes keep the original weight so weights might not sum to 1.
You can see how this follows recent findings about direction and size (e.g. https://x.com/prateeky2806/status/1727589818618523783)
While the results in "just" merging are not changing that much, merging with a continually trained model (Sailor) that added many languages look quite good! https://twitter.com/LChoshen/status/1823002796259791276/photo/1
Criticism (@askalphaxiv didn't upload comment):
There is a vast overclaiming calling Sailor a different pretrained model.
Quite complex, hard to know if it will generalize
and they only show a specific model.
"
.
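The merging steps described in the comment literal above can be sketched roughly as follows. This is a minimal NumPy illustration of row-wise, change-weighted averaging of task vectors, written from the post's summary only. It is not the WIDEN paper's implementation. The function names (`split_rows`, `merge_rows`), the change score (relative magnitude change plus one minus cosine similarity), and the simple normalization of weights are all assumptions. The "trick" for sometimes keeping the original weight is left out, so here the per-row weights always sum to 1.

```python
import numpy as np


def split_rows(matrix, eps=1e-12):
    """Split each row into a magnitude (its L2 norm) and a unit direction."""
    magnitude = np.linalg.norm(matrix, axis=1, keepdims=True)
    direction = matrix / np.maximum(magnitude, eps)
    return magnitude, direction


def merge_rows(pretrained, finetuned, eps=1e-12):
    """Merge fine-tuned matrices into the pretrained one, row by row.

    Each model is treated as a diff from `pretrained` (a task vector).
    Rows that changed more, in magnitude or direction, get a larger weight.
    The change score used here is an illustrative assumption.
    """
    base_mag, base_dir = split_rows(pretrained)
    deltas, changes = [], []
    for weights in finetuned:
        mag, direction = split_rows(weights)
        # Relative change in row magnitude.
        mag_change = np.abs(mag - base_mag)[:, 0] / (base_mag[:, 0] + eps)
        # Change in row direction: 1 - cosine similarity to the base row.
        dir_change = 1.0 - np.sum(direction * base_dir, axis=1)
        changes.append(mag_change + dir_change)
        deltas.append(weights - pretrained)

    changes = np.stack(changes)   # shape: (n_models, n_rows)
    deltas = np.stack(deltas)     # shape: (n_models, n_rows, n_cols)

    # Normalize scores per row; fall back to a plain average if nothing changed.
    total = changes.sum(axis=0, keepdims=True)
    row_weights = np.where(
        total > eps,
        changes / np.maximum(total, eps),
        1.0 / len(finetuned),
    )
    return pretrained + np.sum(row_weights[:, :, None] * deltas, axis=0)


if __name__ == "__main__":
    base = np.eye(2)
    model_a = np.array([[2.0, 0.0], [0.0, 1.0]])  # only row 0 differs from base
    model_b = np.array([[1.0, 0.0], [1.0, 1.0]])  # only row 1 differs from base
    print(merge_rows(base, [model_a, model_b]))
```

In the toy example each model changes a different row, so each merged row is taken entirely from the model that changed it.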
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
https://schema.org/keywords
keywords
(this is a literal)
"Sailor"
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
https://schema.org/keywords
keywords
(this is a literal)
"WIDEN"
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
https://schema.org/keywords
keywords
(this is a literal)
"large-language-models"
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
https://schema.org/keywords
keywords
(this is a literal)
"model-merging"
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
this assertion
https://schema.org/keywords
keywords
(this is a literal)
"weight-disentanglement"
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
The assertion above
http://www.w3.org/ns/prov#linksTo
linksTo
https://x.com/LChoshen/status/1823002789217493392
status/1823002789217493392
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
The assertion above
http://www.w3.org/ns/prov#wasAssociatedWith
wasAssociatedWith
https://x.com/LChoshen
LChoshen
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
The assertion above
http://www.w3.org/ns/prov#wasAttributedTo
wasAttributedTo
https://orcid.org/0000-0002-0085-6496
0000-0002-0085-6496
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
The assertion above
http://www.w3.org/ns/prov#wasAttributedTo
wasAttributedTo
https://w3id.org/np/RAoSadUw99...
Leshem Choshen 🤖🤗 @ICML wanna talk?
.
This is the identifier for the assertion of this nanopublication.
https://w3id.org/np/RAPwHYQQtX...#assertion
The assertion above
http://www.w3.org/ns/prov#wasGeneratedBy
wasGeneratedBy
This is a local identifier minted within the nanopublication.
https://w3id.org/np/RAPwHYQQtX...#activity
activity
.
This is a local identifier minted within the nanopublication.
https://w3id.org/np/RAPwHYQQtX...#activity
activity
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
type
https://sense-nets.xyz/supervisedActivity
supervisedActivity
.
This is a local identifier minted within the nanopublication.
https://w3id.org/np/RAPwHYQQtX...#activity
activity
http://www.w3.org/ns/prov#wasAssociatedWith
wasAssociatedWith
https://sense-nets.xyz/
Sensenets
.
This is the identifier for this whole nanopublication.
https://w3id.org/np/RAPwHYQQtX...
This nanopublication
http://www.w3.org/2000/01/rdf-schema#label
has the label
(this is a literal)
"CoSMO Semantic Post"
.
This is the identifier for this whole nanopublication.
https://w3id.org/np/RAPwHYQQtX...
This nanopublication
http://purl.org/nanopub/x/hasNanopubType
has the type
https://sense-nets.xyz/SemanticPost
SemanticPost
.
This is the identifier for this whole nanopublication.
https://w3id.org/np/RAPwHYQQtX...
This nanopublication
date and time when the nanopublication was created
http://purl.org/dc/terms/created
was created on
(this is a literal)
"2024-09-03T21:16:16.131Z"
.
This is a local identifier minted within the nanopublication.
https://w3id.org/np/RAPwHYQQtX...#sig
sig
http://purl.org/nanopub/x/hasSignatureTarget
has as target
This is the identifier for this whole nanopublication.
https://w3id.org/np/RAPwHYQQtX...
this nanopublication
.
This is a local identifier minted within the nanopublication.
https://w3id.org/np/RAPwHYQQtX...#sig
sig
http://purl.org/nanopub/x/hasAlgorithm
has the algorithm
(this is a literal)
"RSA"
.
This is a local identifier minted within the nanopublication.
https://w3id.org/np/RAPwHYQQtX...#sig
sig
http://purl.org/nanopub/x/hasPublicKey
has the public key
(this is a literal)
"MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArHtI92jm8pAYVsvJabxLGfOT+7G0JyJGh2gwjB5x2pFPga6wWTd+rNBWWUZViIFnaJrBEsJpgdnoupLU9ppwn+khMiGRfxqGsDDzwHcj3Jc75CRys7d3etwXdBdoXfBgjsJiZBazwm13idr6tljRrC1TaEJBnRQAqzBw9cLDeGY77cSznzXT39feUGT168dpCSE9O6u/48DvvWVqciHGsH9cQ+LroJJVsMrorwtsdZnAK+q48wtIP6pIpw5shSJ5LnA0qeN/f4TvTFDV6ItYIXjiWWpTECc/Bxmfnyat3B5xWCu9nvz8fEs7Ns0TuzQwT3/K55iSKDEIi/E0nO97xwIDAQAB"
.
This is a local identifier minted within the nanopublication.
https://w3id.org/np/RAPwHYQQtX...#sig
sig
http://purl.org/nanopub/x/hasSignature
has the signature
(this is a literal)
"BMCHmxj4685c4tB4MzssQlbmilVpyC5oQEPuiEqc4AHbLlU0uJStQhpua7d52ZKIDFMi9nmrvLJc7eFuYs6gyjJzve0WY5BNHdpurTkJeU3Tyh9G2vsmlVof2FQc6QaijFR5DFKECKems3CSMJuBxChDj+hqrjS6DloVTdEIEalSHXsOw0utP7P/ZZvdhvkTMYaPPhuJspFjyGYmfLVb/m+Gr2zlsQgXRxdS5qc8LvGdAAjRxS4LAwzk7rklJXEfyDEWZ+B9V5hPzsmmqb60iFPaA9PHyqFGUT+EP1WFyJdIVL5PS48izFWx0+KDaTH4Nm6JrQUSO8kNx348rgKYZA=="
.
This is the identifier for this whole nanopublication.
https://w3id.org/np/RAPwHYQQtX...
This nanopublication
http://purl.org/nanopub/x/wasCreatedAt
was created at
https://sense-nets.xyz/
Sensenets
.
This is a local identifier minted within the nanopublication.
https://w3id.org/np/RAPwHYQQtX...#sig
sig
http://purl.org/nanopub/x/singedBy
singedBy
https://sense-nets.xyz/
Sensenets
.
This is a local identifier minted within the nanopublication.
https://w3id.org/np/RAPwHYQQtX...#sig
sig
http://www.w3.org/ns/prov#wasAssociatedWith
wasAssociatedWith
https://w3id.org/np/RAoSadUw99...T86OrTzk16VtssigningDelegation
RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16VtssigningDelegation
.
This is the identifier for this whole nanopublication.
https://w3id.org/np/RAPwHYQQtX...
this nanopublication
http://purl.org/dc/terms/creator
creator
https://w3id.org/np/RAoSadUw99...
Leshem Choshen 🤖🤗 @ICML wanna talk?
.
This is the identifier for this whole nanopublication.
https://w3id.org/np/RAPwHYQQtX...
this nanopublication
http://purl.org/dc/terms/license
license
https://creativecommons.org/licenses/by/4.0/
4.0
.
This is the identifier for this whole nanopublication.
https://w3id.org/np/RAPwHYQQtX...
this nanopublication
http://www.w3.org/ns/prov#wasAttributedTo
wasAttributedTo
https://orcid.org/0000-0002-0085-6496
0000-0002-0085-6496
.
This is the identifier for this whole nanopublication.
https://w3id.org/np/RAPwHYQQtX...
this nanopublication
https://sense-nets.xyz/hasRootSigner
hasRootSigner
(this is a literal)
"0xf6ECcfD463afB464dcC85b051DF2E93E2646E6D2"
.