tags:
- dataset_size:10884622
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: What was the average household size in the county?
  sentences:
  - "Forth Dimension Displays (ForthDD) is a British optoelectronics company based in Dalgety Bay, Fife, United Kingdom.\n\nCompany overview\n\nFounded in 1998 as Micropix and known later as CRL Opto and CRLO Displays, ForthDD makes high resolution microdisplays and spatial light modulators (SLM). The microdisplays are used in near-to-eye (NTE) applications for the military training and simulation, medical imagery, virtual reality and high definition image processing industries. The SLMs are used for structured light projection in 3D optical metrology and 3D super resolution microscopy. Headquartered in Dalgety Bay, Scotland, ForthDD also operates sales offices in the United States, Germany and Japan, and a customer support office in Germany. Previously funded by venture capitalists, in January 2011 ForthDD was acquired by Kopin Corporation, a NASDAQ listed company based in Taunton, Massachusetts, USA.\n\nTechnology\n\nForthDD's microdisplays and SLMs are based on a proprietary, high-speed, ferroelectric liquid crystal on silicon (LCOS) platform, protected by a number of patents. For the generation of colour and greyscale, ForthDD's microdisplays use a process called Time Domain Imaging (TDI™). This process involves rendering the red, green and blue colour components which make up an image sequentially over time at high speed. This happens so fast that the human visual system integrates the components into a single, full colour image. This enables the microdisplays to use the same pixel mirror for all three colour components, and avoids the artifacts associated with sub-pixels.\n\nLCOS Technology History\n\nThe first LCOS device originated in 1973, followed by a development of a liquid-crystal light valve ten years later. It was not until 1993, that a microdisplay with a resolution sufficient for use as a display was reported by DisplayTech (now Citizen Finedevices). It was capable of full red–green–blue image generation, enabled by the use of a fast-switching ferroelectric liquid crystal.\n\nDuring the early part of the 21st century, many microdisplay manufacturers focused on applying the technology to rear-projection-based high-definition television (HDTV) systems. However, due to developments in the manufacturing process of large-panel Liquid Crystal Display Televisions (LCD TVs) and resulting drops in the cost of components, LCD based TVs matured into the more popular consumer choice. By late 2007 almost all microdisplay Rear Projection Television (RPTV) manufacturers had withdrawn their TVs from production.\n\nAs a result, a number of microdisplay manufacturers either disappeared completely or started working on other technologies. Some companies diversified, whilst others concentrated on a niche market instead.\n\nProducts\n\nForthDD is a supplier of microdisplays for Near-To-Eye (NTE) applications and spatial light modulators for fringe projection systems.\n\nForthDD supplies full colour, all digital QXGA (2048 × 1536), SXGA (1280 × 1024) and WXGA (1280 × 768) microdisplays. These products are available as chipsets and board level based products.\n\nApplications\n\nForthDD's microdisplays are typically used in the following application areas: Training and Virtual Environments, Medical Systems and Electronic Viewfinders (EVFs). Later system developments have allowed ForthDD to enter markets such as 3D Optical Metrology and, using phase modulation, Super-resolution microscopy.\n\nTraining and Virtual Environments\n\nForthDD's microdisplays can be found in various training and simulation applications across military and civilian environments within devices such as virtual binoculars, monocular viewers and most commonly, immersive HMDs (for example, in NVIS HMDs). By using HMDs to immerse the user in the virtual 3D environment, different scenarios, which may be too dangerous or expensive to replicate in the real world, can be explored.\n\nMedical systems\n\nMicrodisplays can be used in high-end medical/surgical microscopes in order to either replace the optical image or overlay data on the image (e.g. an MRI scan). When combined with a microdisplay the microscope becomes a more powerful tool and permits users to navigate the desired surface in real time with a very high degree of accuracy. Other medical applications include viewing systems such as endoscopes.\n\nFilm and Television\n\nForthDD's microdisplays are used in Electronic Viewfinders (EVFs) for HD digital cinema cameras. ARRI uses ForthDD's technology in its EVFs.\n\n3D Optical Metrology\n\nForthDD's microdisplays are used for fringe projection and confocal inspection in non-contact surface quality inspection systems (for example, in Sensofar products).\n\nReferences\n\nExternal links\n Forth Dimension Displays\n\nDisplay technology\nLiquid crystal displays\nCompanies based in Fife\nCompanies established in 1998"
  - "Unicoi County () is a county located in the U.S. state of Tennessee. As of the 2010 census, the population was 18,313. Its county seat is Erwin. Unicoi is a Cherokee word meaning \"white,\" \"hazy,\" \"fog-like,\" or \"fog draped.\"\n\nUnicoi County is part of the Johnson City Metropolitan Statistical Area, which is a component of the Johnson City–Kingsport–Bristol, TN-VA Combined Statistical Area, commonly known as the \"Tri-Cities\" region.\n\nHistory\n\nUnicoi County was created in 1875 from portions of Washington and Carter counties. Its first settlers had arrived more than century earlier but the population had been small. The county remained predominantly agrarian until the railroads were constructed in the area in the 1880s.\n\nDuring the 1910s, the Clinchfield Railroad established a pottery in Erwin, which eventually incorporated under the name, \"Southern Potteries.\" This company produced a popular brand of dishware, commonly called Blue Ridge China, which featured hand-painted underglaze designs. While the company folded in the 1950s, Blue Ridge dishes remain popular with antique collectors.\n\nIn 1916, a circus elephant, Mary, was hanged in Erwin for killing her trainer. Hanging was chosen as the method of execution since all available guns were believed inadequate for killing an elephant. The hanging was the subject of a book, The Day They Hung the Elephant, by Charles Edwin Price.\n\nPronunciation\n\nHear it spoken (Voice of Unicoi County Mayor Greg Lynch, 2010)\n\nGeography\n\nAccording to the U.S. Census Bureau, the county has a total area of , of which is land and (0.2%) is water. It is the fifth-smallest county in Tennessee by total area. The Nolichucky River, which enters Unicoi County from North Carolina, is the county's primary drainage.\n\nUnicoi County is situated entirely within the Blue Ridge Mountains, specifically the Bald Mountains (south of the Nolichucky) and the Unaka Range (north of the Nolichucky). Big Bald, which at is the highest mountain in the Balds, is also Unicoi County's high point. Traversed by the Appalachian Trail, the mountain is topped by a grassy bald, allowing a 360-degree view of the surrounding mountains.\n\nAdjacent counties\nWashington County (north)\nCarter County (northeast)\nMitchell County, North Carolina (east)\nYancey County, North Carolina (south)\nMadison County, North Carolina (southwest)\nGreene County (west)\n\nNational protected areas\nAppalachian Trail (part)\nCherokee National Forest (part)\n\nState protected areas\nRocky Fork State Park\n\nMajor Highways\n\nDemographics\n\n2020 census\n\nAs of the 2020 United States census, there were 17,928 people, 7,658 households, and 4,953 families residing in the county.\n\n2000 census\nAs of the census of 2000, there were 17,667 people, 7,516 households, and 5,223 families residing in the county. The population density was 95 people per square mile (37/km2). There were 8,214 housing units at an average density of 44 per square mile (17/km2). The racial makeup of the county was 97.96% White, 0.07% Black or African American, 0.25% Native American, 0.08% Asian, 0.03% Pacific Islander, 0.95% from other races, and 0.66% from two or more races. 1.94% of the population were Hispanic or Latino of any race.\n\nThere were 7,516 households, out of which 26.60% had children under the age of 18 living with them, 56.40% were married couples living together, 9.50% had a female householder with no husband present, and 30.50% were non-families. 27.50% of all households were made up of individuals, and 13.40% had someone living alone who was 65 years of age or older. The average household size was 2.31 and the average family size was 2.80.\n\nIn the county, the population was spread out, with 20.50% under the age of 18, 7.50% from 18 to 24, 27.50% from 25 to 44, 26.50% from 45 to 64, and 18.10% who were 65 years of age or older. The median age was 42 years. For every 100 females, there were 95.10 males. For every 100 females age 18 and over, there were 91.60 males.\n\nThe median income for a household in the county was $29,863, and the median income for a family was $36,871. Males had a median income of $30,206 versus $20,379 for females. The per capita income for the county was $15,612. About 8.70% of families and 13.10% of the population were below the poverty line, including 17.70% of those under age 18 and 13.50% of those age 65 or over.\n\nCommunities\n\nTowns\nErwin (county seat)\nUnicoi\n\nCensus-designated place\nBanner Hill\n\nUnincorporated communities\nBumpus Cove (partial)\n Clearbranch\nFlag Pond\nLimestone Cove\n Shallowford\n\nPolitics\nUnicoi County, like most of eastern Tennessee, is heavily Republican and has been since the Civil War. Since its founding, it has supported the Republican presidential candidate in all but one election (1912, when it backed Theodore Roosevelt's Progressive Party campaign).\n\nAt the state level, Unicoi County has historically been slightly more receptive to Democratic candidates, generally when they win by landslides. It often supported Democratic candidates for governor in the Solid South era. More recently, it backed Democrat Ned McWherter in the 1986 and 1990 gubernatorial elections and Phil Bredesen in 2006, when he won every county in the state.\n\nSee also\nNational Register of Historic Places listings in Unicoi County, Tennessee\n\nReferences\n\nExternal links\n\nOfficial website\nUnicoi County Chamber of Commerce\nUnicoi County Schools\nTNGenWeb\n\n\n1875 establishments in Tennessee\nPopulated places established in 1875\nJohnson City metropolitan area, Tennessee\nCounties of Appalachia\nSecond Amendment sanctuaries in Tennessee"
  - "Sevier County is a county located in the U.S. state of Arkansas. As of the 2010 census, the population was 17,058. The county seat is De Queen. Sevier County is Arkansas's 16th county, formed on October 17, 1828, and named for Ambrose Sevier, U.S. Senator from Arkansas. On November 3, 2020, voters in Sevier County, AR approved alcohol sales by a vote of 3,499 (67.31 percent) to 1,699 (32.69 percent).\n\nHistory\nSevier County was organized on October 17, 1828, under legislative authority. It was formed from Hempstead and Miller Counties. Five days later on October 22, 1828, the legislature expanded the county's border, incorporating more land south of the Red River. Hempstead, Miller and Crawford Counties as well as the Choctaw Nation in Indian Territory bound Sevier County. The establishment of Sevier County became effective on November 1, 1828.\n\nThe county seat has undergone several changes since Sevier County was organized. The first county seat was Paraclifta. In 1871, the Lockes donated of land. As a result, the county seat was moved to Lockesburg. In 1905, the county seat was again moved to De Queen. Sevier County is known as \"The Land of Lakes\", \"The Land of Fruits and Flowers\" and \"The Home of Friendly People\". The county has five lakes within a radius, five rivers and mountain streams and forests.\n\nGeography\nAccording to the U.S. Census Bureau, the county has a total area of , of which is land and (2.8%) is water.\n\nNotable people\nCurrent or former residents of Sevier County include:\nCollin Raye, country music singer.\nWes Watkins, U.S. Congressman (Republican, Oklahoma) lived for a time in De Queen as a child.\n\nMajor highways\n Future Interstate 49\n U.S. Highway 59\n U.S. Highway 70\n U.S. Highway 71\n U.S. Highway 371\n Highway 24\n Highway 27\n Highway 41\n\nAdjacent counties\nPolk County (north)\nHoward County (east)\nHempstead County (southeast)\nLittle River County (south)\nMcCurtain County, Oklahoma (west)\n\nNational protected area\n Pond Creek National Wildlife Refuge\n\nDemographics\n\n2020 census\n\nAs of the 2020 United States census, there were 15,839 people, 5,885 households, and 4,279 families residing in the county.\n\n2000 census\nAs of the 2000 census, there were 15,757 people, 5,708 households, and 4,223 families residing in the county. The population density was 28 people per square mile (11/km2). There were 6,434 housing units at an average density of 11 per square mile (4/km2). The racial makeup of the county was 79.61% White, 4.94% Black or African American, 1.82% Native American, 0.13% Asian, 0.06% Pacific Islander, 11.84% from other races, and 1.61% from two or more races. 19.72% of the population were Hispanic or Latino of any race. 17.32% reported speaking Spanish at home.\n\nThere were 5,708 households, out of which 36.40% had children under the age of 18 living with them, 59.30% were married couples living together, 10.00% had a female householder with no husband present, and 26.00% were non-families. 22.80% of all households were made up of individuals, and 11.00% had someone living alone who was 65 years of age or older. The average household size was 2.73 and the average family size was 3.19.\n\nIn the county, the population was spread out, with 28.20% under the age of 18, 9.50% from 18 to 24, 27.70% from 25 to 44, 21.30% from 45 to 64, and 13.20% who were 65 years of age or older. The median age was 34 years. For every 100 females there were 99.10 males. For every 100 females age 18 and over, there were 97.00 males.\n\nThe median income for a household in the county was $30,144, and the median income for a family was $34,560. Males had a median income of $25,709 versus $17,666 for females. The per capita income for the county was $14,122. About 14.40% of families and 19.20% of the population were below the poverty line, including 26.90% of those under age 18 and 14.20% of those age 65 or over.\n\nGovernment\nOver the past few election cycles, Sevier County has trended heavily towards the GOP. The last Democrat (as of 2020) to carry this county was Arkansas native Bill Clinton in 1996.\n\nCommunities\n\nCities\n De Queen (county seat)\n Horatio\n Lockesburg\n\nTowns\n Ben Lomond\n Gillham\n\nTownships\n\n Bear Creek (contains most of De Queen)\n Ben Lomond (contains Ben Lomond)\n Buckhorn\n Clear Creek (contains Horatio)\n Jefferson\n Mill Creek\n Mineral (contains Gillham)\n Monroe (contains small part of De Queen)\n Paraclifta\n Red Colony (contains Lockesburg)\n Saline\n Washington\n\nSource:\n\nSee also\n List of lakes in Sevier County, Arkansas\n National Register of Historic Places listings in Sevier County, Arkansas\n\nReferences\n\nExternal links\n Sevier County, Arkansas entry on the Encyclopedia of Arkansas History & Culture\n\n\n1828 establishments in Arkansas Territory\nPopulated places established in 1828"
- source_sentence: What is it called if you mistake a reflection in a mirror for the real thing?
  sentences:
  - Whitehead describes causal efficacy as "the experience dominating the primitive living organisms, which have a sense for the fate from which they have emerged, and the fate towards which they go." It is, in other words, the sense of causal relations between entities, a feeling of being influenced and affected by the surrounding environment, unmediated by the senses. Presentational immediacy, on the other hand, is what is usually referred to as "pure sense perception", unmediated by any causal or symbolic interpretation, even unconscious interpretation. In other words, it is pure appearance, which may or may not be delusive (e.g. mistaking an image in a mirror for "the real thing").
  - Even prior to the penetration of European interests, Southeast Asia was a critical part of the world trading system. A wide range of commodities originated in the region, but especially important were spices such as pepper, ginger, cloves, and nutmeg. The spice trade initially was developed by Indian and Arab merchants, but it also brought Europeans to the region. First Spaniards (Manila galleon) and Portuguese, then the Dutch, and finally the British and French became involved in this enterprise in various countries. The penetration of European commercial interests gradually evolved into annexation of territories, as traders lobbied for an extension of control to protect and expand their activities. As a result, the Dutch moved into Indonesia, the British into Malaya and parts of Borneo, the French into Indochina, and the Spanish and the US into the Philippines.
  - Other important industries are financial services, especially mutual funds and insurance. Boston-based Fidelity Investments helped popularize the mutual fund in the 1980s and has made Boston one of the top financial cities in the United States. The city is home to the headquarters of Santander Bank, and Boston is a center for venture capital firms. State Street Corporation, which specializes in asset management and custody services, is based in the city. Boston is a printing and publishing center — Houghton Mifflin Harcourt is headquartered within the city, along with Bedford-St. Martin's Press and Beacon Press. Pearson PLC publishing units also employ several hundred people in Boston. The city is home to three major convention centers—the Hynes Convention Center in the Back Bay, and the Seaport World Trade Center and Boston Convention and Exhibition Center on the South Boston waterfront. The General Electric Corporation announced in January 2016 its decision to move the company's global headquarters to the Seaport District in Boston, from Fairfield, Connecticut, citing factors including Boston's preeminence in the realm of higher education.
- source_sentence: Terry David is known for his campaign for his son, David, who was detained at which detention center?
  sentences:
  - David Wade (Louisiana) David Wade (June 15, 1911 – May 11, 1990) was a decorated American lieutenant general from three wars who after military retirement on March 1, 1967, served in two appointed positions in the state government of his native Louisiana. The David Wade Correctional Center, a prison in Claiborne Parish, is named in his honor.
  - The Older Ones The Older Ones is the first compilation album by Norwegian blackened death metal band Old Funeral, which was made up by key players in the Norwegian black metal scene, including bassist/vocalist Olve "Abbath" Eikemo (Immortal), guitarist Harald "Demonaz" Nævdal (Immortal) and guitarist Kristian "Varg" Vikernes (Burzum). By the time this album was released, the members had already gone their separate ways, with Immortal a going concern for Abbath and Varg in jail.
  - David Hicks David Matthew Hicks (born 7 August 1975) is an Australian who was detained by the United States in Guantanamo Bay detention camp from 2001 until 2007. He had attended the Al Farouq training camp para-military training in Afghanistan during 2001.
- source_sentence: Is Bare-Metal Stent Implantation Still Justifiable in High Bleeding Risk Patients Undergoing Percutaneous Coronary Intervention?
  sentences:
  - Primary percutaneous coronary interventions (PPCI) with short DTB time offer mortality benefit for ST-segment elevation myocardial infarction but literatures are conflicting on this benefit for high- vs. low-risk patients. In a unique model at Sandwell and West Birmingham Hospitals, five interventional cardiologists provide 24-h PPCI at whichever one of its two DGH that patients present to. A retrospective audit was performed on 3 years (July 2005-June 2008) of PPCI data in the British Cardiovascular Intervention Society database. Data were analysed in four periods corresponding to change from daytime-only to 24-h PPCI. DTB time and in-hospital mortality were the main outcome measures.
  - 'Compared with patients without, those with 1 or more HBR criteria had worse outcomes, owing to higher ischemic and bleeding risks. Among HBR patients, major adverse cardiovascular events occurred in 22.6% of the E-ZES and 29% of the BMS patients (hazard ratio: 0.75; 95% confidence interval: 0.57 to 0.98; p = 0.033), driven by lower myocardial infarction (3.5% vs. 10.4%; p<0.001) and target vessel revascularization (5.9% vs. 11.4%; p = 0.005) rates in the E-ZES arm. The composite of definite or probable stent thrombosis was significantly reduced in E-ZES recipients, whereas bleeding events did not differ between stent groups.'
  - The management of noncorrectable extra hepatic biliary atresia includes portoenterostomy, although the results of the surgery are variable. This study was done to develop criteria that could successfully predict the outcome of surgery based on preoperative data, including percutaneous liver biopsy, allowing a more selective approach to the care of these babies.
- source_sentence: Empirical Study of Capsule An-di-er(安迪尔胶囊) on Slow Arrhythmic Prevention
  sentences:
  - 'Objective: To approach the effect of Capsule An-di-er on slow arrhythmic prevention. Method: 50 rats were divided into 5 groups randomly, which were model group, positive control group (Pellet Xinbao), Capsule An-di-er low dose group, midium dose group and high dose group. Administer by intragastric administration for 7 days. After administering 2 hours last time, Propranolol according to 5mg/kg was injected by intraperitoneal injection. Then record the heart rate at 2, 5, 10 and 20min. Result: The heart rate in Capsule An-di-er midium dose group decreased less than in model group (P0.05), and that in Capsule An-di-er high dose group decreased less than in model group remarkably (P0.01). Conclusion: Capsule An-di-er may have the effect of activating adrenoreceptor and enhancing catechol amine to deliver.'
  - We show a Kalton-Weis type theorem for the general case of non-commuting operators. More precisely, we consider sums of two possibly non-commuting linear operators defined in a Banach space such that one of the operators admits a bounded $H^\infty$-calculus, the resolvent of the other one satisfies some weaker boundedness condition and the commutator of their resolvents has certain decay behavior with respect to the spectral parameters. Under this consideration, we show that the sum is closed and that after a sufficiently large positive shift it becomes invertible, and moreover sectorial. As an application we recover a classical result on the existence, uniqueness and maximal $L^{p}$-regularity for solutions of the abstract linear non-autonomous parabolic problem.
  - 'Abstract : A computing program STLPLT is described which allows the plot of stereographic, stereognomonic or gnomonic projection from the x, y coordinates of the Laue spots measured in millimeters in the film. The cylindrical, flat transmission and flat back-reflection Laue techniques can be used. The selected projection is plotted in a circle of 100 mm. radius for any desired radius of the reference sphere. The blind zones of the experimental record are also plotted in the projection. The program is written in FORTRAN-IV for IBM 7074 and generates a tape to be used in a CalComp plotter. (Author)'
datasets:
- sentence-transformers/squad
- sentence-transformers/trivia-qa-triplet
      name: Cosine Map@100
---

# SSE Retrieval MRL

### Model Description
- **Model Type:** Sentence Transformer
<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
- **Maximum Sequence Length:** inf tokens
- **Output Dimensionality:** 512 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Datasets:**
    - [squad](https://huggingface.co/datasets/sentence-transformers/squad)
    - [trivia_qa](https://huggingface.co/datasets/sentence-transformers/trivia-qa-triplet)
    - [allnli](https://huggingface.co/datasets/sentence-transformers/all-nli)
    - [pubmedqa](https://huggingface.co/datasets/sentence-transformers/pubmedqa)
    - [hotpotqa](https://huggingface.co/datasets/sentence-transformers/hotpotqa)
    - [miracl](https://huggingface.co/datasets/sentence-transformers/miracl)
    - [mr_tydi](https://huggingface.co/datasets/sentence-transformers/mr-tydi)
    - msmarco
    - msmarco_10m
    - msmarco_hard
    - mldr
    - [s2orc](https://huggingface.co/datasets/sentence-transformers/s2orc)
    - [swim_ir](https://huggingface.co/datasets/nthakur/swim-ir-monolingual)
    - [paq](https://huggingface.co/datasets/sentence-transformers/paq)
    - [nq](https://huggingface.co/datasets/tomaarsen/natural-questions-hard-negatives)
    - scidocs
- **Language:** en
- **License:** mit

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture
```
SentenceTransformer(
  (0): SSE(
    (embedding): EmbeddingBag(30522, 512, mode='mean')
  )
)
```
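The `EmbeddingBag(30522, 512, mode='mean')` layer looks up one 512-dimensional vector per token id (a 30522-entry, BERT-style WordPiece vocabulary) and averages them into a single fixed-size sentence embedding. A pure-Python sketch of that mean pooling; the toy table, its dimensions, and the helper name are illustrative only, not part of the model:

```python
def embedding_bag_mean(embedding_table, token_ids):
    """Average the embedding rows selected by token_ids.

    Mirrors torch.nn.EmbeddingBag(..., mode='mean') for a single bag:
    look up each token id's vector, then take the per-dimension mean.
    """
    dim = len(embedding_table[0])
    summed = [0.0] * dim
    for tid in token_ids:
        for i, value in enumerate(embedding_table[tid]):
            summed[i] += value
    return [s / len(token_ids) for s in summed]

# Toy 4-token vocabulary with 2-dimensional embeddings.
table = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(embedding_bag_mean(table, [1, 2]))  # [0.5, 0.5]
```

Because the pooling is a plain mean, the sentence embedding is order-invariant over tokens, which is what makes the architecture cheap enough to have no practical sequence-length limit.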
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library (`pip install -U sentence-transformers`). Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("...")

# Run inference
queries = [
    "...",
]
documents = [
    "...",
    "...",
    "...",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.5623, -0.0658, -0.0888]])
```
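Because the model is tagged with `loss:MatryoshkaLoss`, the 512-dimensional embeddings are trained so that leading prefixes remain useful on their own: you can typically truncate to a smaller dimensionality and re-normalize before computing cosine similarity, trading some accuracy for memory and speed. A minimal sketch of that truncation; the `truncate_normalize` helper and the toy 4-dimensional vector are illustrative, not part of the model's API:

```python
import math

def truncate_normalize(embedding, dim):
    # Keep the first `dim` components, then L2-normalize so that the dot
    # product of two truncated vectors equals their cosine similarity.
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

vec = [3.0, 4.0, 0.1, -0.2]  # stand-in for a 512-dim model embedding
print(truncate_normalize(vec, 2))  # first two dims, renormalized to unit length
```

In practice one would truncate to one of the dimensions the Matryoshka training was configured with; which dimensions those are is not stated in this card.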
|
| 1171 |
|
| 1172 |
-
|
| 1173 |
-
|
|
|
|
|
|
|
| 1174 |
|
| 1175 |
-
|
| 1176 |
|
| 1177 |
-
|
| 1178 |
-
-->
|
### Metrics

#### Information Retrieval

* Datasets: `NanoClimateFEVER`, `NanoDBPedia`, `NanoFEVER`, `NanoFiQA2018`, `NanoHotpotQA`, `NanoMSMARCO`, `NanoNFCorpus`, `NanoNQ`, `NanoQuoraRetrieval`, `NanoSCIDOCS`, `NanoArguAna`, `NanoSciFact` and `NanoTouche2020`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | NanoClimateFEVER | NanoDBPedia | NanoFEVER | NanoFiQA2018 | NanoHotpotQA | NanoMSMARCO | NanoNFCorpus | NanoNQ | NanoQuoraRetrieval | NanoSCIDOCS | NanoArguAna | NanoSciFact | NanoTouche2020 |
|:--------------------|:-----------------|:------------|:----------|:-------------|:-------------|:------------|:-------------|:-------|:-------------------|:------------|:------------|:------------|:---------------|
| cosine_accuracy@1   | 0.2              | 0.66        | 0.46      | 0.32         | 0.64         | 0.24        | 0.38         | 0.24   | 0.86               | 0.46        | 0.14        | 0.54        | 0.6531         |
| cosine_accuracy@3   | 0.48             | 0.84        | 0.76      | 0.48         | 0.88         | 0.46        | 0.56         | 0.52   | 0.98               | 0.62        | 0.5         | 0.6         | 0.9184         |
| cosine_accuracy@5   | 0.54             | 0.84        | 0.82      | 0.58         | 0.94         | 0.52        | 0.6          | 0.62   | 0.98               | 0.68        | 0.56        | 0.66        | 0.9592         |
| cosine_accuracy@10  | 0.68             | 0.9         | 0.92      | 0.62         | 0.96         | 0.6         | 0.76         | 0.7    | 1.0                | 0.76        | 0.7         | 0.74        | 1.0            |
| cosine_precision@1  | 0.2              | 0.66        | 0.46      | 0.32         | 0.64         | 0.24        | 0.38         | 0.24   | 0.86               | 0.46        | 0.14        | 0.54        | 0.6531         |
| cosine_precision@3  | 0.18             | 0.5667      | 0.2533    | 0.2267       | 0.42         | 0.1533      | 0.3467       | 0.1733 | 0.38               | 0.2933      | 0.1667      | 0.2133      | 0.6395         |
| cosine_precision@5  | 0.128            | 0.52        | 0.172     | 0.176        | 0.296        | 0.104       | 0.296        | 0.124  | 0.236              | 0.252       | 0.112       | 0.144       | 0.6245         |
| cosine_precision@10 | 0.102            | 0.444       | 0.096     | 0.102        | 0.16         | 0.06        | 0.246        | 0.074  | 0.124              | 0.162       | 0.07        | 0.082       | 0.5551         |
| cosine_recall@1     | 0.1017           | 0.0783      | 0.4367    | 0.1862       | 0.32         | 0.24        | 0.0339       | 0.23   | 0.7707             | 0.0967      | 0.14        | 0.505       | 0.0445         |
| cosine_recall@3     | 0.2417           | 0.1603      | 0.7167    | 0.3213       | 0.63         | 0.46        | 0.0646       | 0.5    | 0.932              | 0.1817      | 0.5         | 0.58        | 0.1288         |
| cosine_recall@5     | 0.2733           | 0.2095      | 0.7867    | 0.3895       | 0.74         | 0.52        | 0.0773       | 0.6    | 0.9453             | 0.2607      | 0.56        | 0.645       | 0.2023         |
| cosine_recall@10    | 0.3923           | 0.2983      | 0.8867    | 0.4547       | 0.8          | 0.6         | 0.1083       | 0.69   | 0.9627             | 0.3347      | 0.7         | 0.735       | 0.3514         |
| **cosine_ndcg@10**  | **0.2998**       | **0.5493**  | **0.6808** | **0.3744**  | **0.7021**   | **0.4132**  | **0.2982**   | **0.4652** | **0.9094**     | **0.3381**  | **0.4105**  | **0.6176**  | **0.6029**     |
| cosine_mrr@10       | 0.3611           | 0.7492      | 0.6318    | 0.4197       | 0.7679       | 0.3537      | 0.4889       | 0.3992 | 0.9122             | 0.5509      | 0.3193      | 0.5933      | 0.7852         |
| cosine_map@100      | 0.2344           | 0.4247      | 0.6105    | 0.3162       | 0.6273       | 0.3733      | 0.1091       | 0.4028 | 0.8847             | 0.2604      | 0.3325      | 0.5824      | 0.4539         |

#### Nano BEIR
* Dataset: `NanoBEIR_mean`
* Evaluated with [<code>NanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.NanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "climatefever",
          "dbpedia",
          "fever",
          "fiqa2018",
          "hotpotqa",
          "msmarco",
          "nfcorpus",
          "nq",
          "quoraretrieval",
          "scidocs",
          "arguana",
          "scifact",
          "touche2020"
      ],
      "dataset_id": "sentence-transformers/NanoBEIR-en"
  }
  ```
| Metric              | Value      |
|:--------------------|:-----------|
| cosine_precision@5  | 0.245      |
| cosine_precision@10 | 0.1752     |
| cosine_recall@1     | 0.2449     |
| cosine_recall@3     | 0.4167     |
| cosine_recall@5     | 0.4777     |
| cosine_recall@10    | 0.5626     |
| **cosine_ndcg@10**  | **0.5124** |
| cosine_mrr@10       | 0.564      |
| cosine_map@100      | 0.4317     |
## Training Details
datasets:
- sentence-transformers/squad
- sentence-transformers/trivia-qa-triplet
      name: Cosine Map@100
---
# 💙 SSE: Stable Static Embedding for Retrieval MRL 💙

### *A lightweight, faster, and powerful embedding model* ✨

🌸 **Performance Snapshot** 🌸
Our SSE model achieves **NDCG@10 = 0.5124** on NanoBEIR — *slightly outperforming* the popular `static-retrieval-mrl-en-v1` (0.5032) while using **half the dimensions** (512 vs 1024)! 💫 Plus, we're **~2× faster** in retrieval thanks to our compact 512D embeddings and Separable Dynamic Tanh. 💙

| Model | NanoBEIR NDCG@10 | Dimensions | Parameters | Speed Advantage | License |
|-------|------------------|------------|------------|-----------------|---------|
| **SSE Retrieval MRL** | **0.5124** ✨ | **512** | **~16M** 🪽 | **~2× faster retrieval** (ultra-efficient!) | Apache 2.0 |
| `static-retrieval-mrl-en-v1` | 0.5032 | 1024 | ~33M | baseline | Apache 2.0 |
| `bge-small-en-v1.5` | 0.4987 | 384 | 33M | 397× slower inference | MIT |
| `all-MiniLM-L6-v2` | 0.4821 | 384 | 22M | 397× slower inference | Apache 2.0 |
| `gte-small` | 0.4795 | 384 | 33M | 397× slower inference | MIT |
| `all-mpnet-base-v2` | 0.5757 | 768 | 110M | 397× slower on CPU 😴 | Apache 2.0 |

> 💡 **Key Insight:** Our model delivers **better accuracy than all 384D small models** while using **512D for richer representation**, yet remains **lighter than MiniLM-L6** (16M vs 22M params)! Perfect for mobile apps! 📱

---

## 💙 **Why Choose SSE Retrieval MRL?** 💙

✅ **Higher NDCG@10** than all comparable small models (<35M params)
✅ **Only ~16M parameters** — 27% smaller than MiniLM-L6 (22M) and 52% smaller than BGE-small (33M)
✅ **512D native output** — richer than 384D models, yet **half the size** of static-retrieval-mrl-en-v1 (1024D)
✅ **Matryoshka-ready** — smoothly truncate to 256D/128D/64D/32D with graceful degradation
✅ **MIT licensed** — free for commercial & personal use 🌼
✅ **CPU-optimized** — runs beautifully on edge devices & modest hardware 💻

---

## 🌸 What is this model? 💙

A **sentence-transformers** model trained with Matryoshka magic ✨ to map sentences & paragraphs into a cozy **512-dimensional** vector space. Designed for:

- 💖 Semantic search that *just gets you*
- 💖 Paraphrase mining with gentle precision
- 💖 Text clustering that feels like organizing your diary
- 💖 Classification tasks with soft confidence

Trained on **10.8M+ samples** across 14 diverse datasets — from trivia questions to medical abstracts — all wrapped in a pastel-efficient architecture! 🌈

---

## 🎀 Model Details 💙

| Property | Value |
|----------|-------|
| **Model Type** | Sentence Transformer (SSE architecture) |
| **Max Sequence Length** | ∞ tokens (yes, really! ✨) |
| **Output Dimension** | 512 (with Matryoshka truncation down to 32D!) |
| **Similarity Function** | Cosine Similarity 💫 |
| **Language** | English 🇬🇧 |
| **License** | MIT (free as a daisy! 🌼) |

```python
# Our dreamy architecture 💙
SentenceTransformer(
  (0): SSE(
    (embedding): EmbeddingBag(30522, 512, mode='mean')
  )
)
```

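The unbounded sequence length comes straight from that `EmbeddingBag(..., mode='mean')` layer: every input, however long, is averaged into one fixed-size vector, so there is no positional window to overflow. A minimal numpy sketch of the idea (toy vocabulary size and dimension with random weights, not the real 30522×512 table):

```python
import numpy as np

rng = np.random.default_rng(0)
table = rng.standard_normal((100, 8))  # toy (vocab_size, dim) embedding table

def embed(token_ids):
    # EmbeddingBag with mode='mean': look up each token's row, then average.
    # Any number of tokens collapses into a single dim-sized vector.
    return table[np.asarray(token_ids)].mean(axis=0)

short_doc = embed([3, 17, 42])            # 3 tokens
long_doc = embed(list(range(100)) * 50)   # 5000 tokens, still fine
print(short_doc.shape, long_doc.shape)    # both (8,)
```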
---

## 🌼 Training Datasets 💙

We learned from **14 datasets**:

| Dataset | Special Flavor |
|---------|----------------|
| `squad` | Q&A pairs with gentle context |
| `trivia_qa` | Fun facts & brain teasers 🧠 |
| `allnli` | Logical reasoning with care |
| `pubmedqa` | Medical wisdom 🩺 |
| `hotpotqa` | Multi-hop reasoning adventures |
| `miracl` | Cross-lingual curiosity 🌍 |
| `mr_tydi` | Global question answering |
| `msmarco` | Real search queries 💭 |
| `msmarco_10m` | Massive-scale search love |
| `msmarco_hard` | Tricky negatives for growth 💪 |
| `mldr` | Long-document cuddles 📚 |
| `s2orc` | Scientific paper whispers 📄 |
| `swim_ir` | Information retrieval elegance |
| `paq` | 64M+ question-answer pairs! ✨ |
| `nq` | Natural questions with heart |
| `scidocs` | Scientific document friendships |

*All trained with **MatryoshkaLoss** — learning representations at multiple scales like Russian nesting dolls! 🪆*

---

## 💙 Evaluation Results (NanoBEIR) 💙

| Dataset | NDCG@10 | MRR@10 | MAP@100 |
|---------|---------|--------|---------|
| **NanoBEIR Mean** | **0.5124** 💙 | **0.5640** | **0.4317** |
| NanoClimateFEVER | 0.2998 | 0.3611 | 0.2344 |
| NanoDBPedia | 0.5493 | 0.7492 | 0.4247 |
| NanoFEVER | 0.6808 | 0.6318 | 0.6105 |
| NanoFiQA2018 | 0.3744 | 0.4197 | 0.3162 |
| NanoHotpotQA | 0.7021 | 0.7679 | 0.6273 |
| NanoMSMARCO | 0.4132 | 0.3537 | 0.3733 |
| NanoNFCorpus | 0.2982 | 0.4889 | 0.1091 |
| NanoNQ | 0.4652 | 0.3992 | 0.4028 |
| NanoQuoraRetrieval | **0.9094** ✨ | **0.9122** | **0.8847** |
| NanoSCIDOCS | 0.3381 | 0.5509 | 0.2604 |
| NanoArguAna | 0.4105 | 0.3193 | 0.3325 |
| NanoSciFact | 0.6176 | 0.5933 | 0.5824 |
| NanoTouche2020 | 0.6029 | 0.7852 | 0.4539 |

> 💙 *Top performance on community-based retrieval (Quora) and scientific fact verification!*

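The headline NDCG@10 numbers discount each relevant hit by its rank, then normalize by the best possible ordering. A tiny self-contained sketch of the metric for binary relevance labels:

```python
import math

def ndcg_at_k(relevances, k=10):
    """relevances: relevance labels of the ranked results, in retrieval order.
    Returns DCG@k divided by the ideal (best-ordered) DCG@k."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Relevant docs at ranks 1 and 3 out of five retrieved:
print(round(ndcg_at_k([1, 0, 1, 0, 0]), 4))  # ≈ 0.9197
```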
---

## 🌸 Usage Example 💙

```python
from sentence_transformers import SentenceTransformer

# Download our pastel-powered model 💙
model = SentenceTransformer("your-model-id-here")

# Encode queries & documents
queries = ["What is the average household size in Unicoi County?"]
documents = [
    "As of the 2000 census... the average household size was 2.31...",
    "Forth Dimension Displays makes microdisplays for VR applications...",
    "Sevier County, Arkansas has an average household size of 2.73..."
]

query_embeddings = model.encode(queries)
doc_embeddings = model.encode(documents)

# Get dreamy similarity scores 💫
similarities = model.similarity(query_embeddings, doc_embeddings)
print(similarities)  # tensor([[0.82, -0.12, 0.45]]) → First doc wins! ✨
```

✨ **Pro tip:** Truncate to 256D for 2× faster retrieval with only ~3% NDCG drop!
```python
model = SentenceTransformer("model-id", truncate_dim=256)
```

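Under the hood, Matryoshka truncation just keeps the first `d` coordinates and re-normalizes so cosine similarity stays well-behaved. A numpy sketch of the equivalent post-processing (the `truncate` helper is illustrative, not a library function):

```python
import numpy as np

def truncate(embeddings, dim):
    # Keep the leading `dim` coordinates of each vector, then re-normalize
    # to unit length so cosine similarity works as before.
    t = np.asarray(embeddings)[..., :dim]
    return t / np.linalg.norm(t, axis=-1, keepdims=True)

full = np.random.default_rng(1).standard_normal((2, 512))
small = truncate(full, 256)
print(small.shape)  # (2, 256)
```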
---

## 🎀 Training Hyperparameters 💙

| Parameter | Value | Why it's cute |
|-----------|-------|---------------|
| **Batch Size** | 512 | Big batches = happy gradients! 🍰 |
| **Learning Rate** | 0.1 | Bold but gentle steps 💃 |
| **Optimizer** | AdamW (fused) | Efficient & eco-friendly 🌱 |
| **Loss** | Matryoshka + MNR Loss | Learning at all scales! 🪆 |
| **Epochs** | 1 | One perfect pass ✨ |
| **Scheduler** | Cosine w/ 10% warmup | Smooth learning curve 🌈 |
| **Precision** | bfloat16 | Efficient & precise 💙 |
| **Hardware** | 1× RTX 3090 | Cozy single-GPU training 🖥️ |

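The Matryoshka + MNR combination in the table can be read as: compute an in-batch contrastive loss, then sum it over several truncated prefix lengths. A toy numpy sketch of that objective (illustrative only; actual training uses `MatryoshkaLoss` wrapping `MultipleNegativesRankingLoss` from sentence-transformers):

```python
import numpy as np

def mnr_loss(q, d, scale=20.0):
    # In-batch MultipleNegativesRankingLoss: document i is the positive for
    # query i; every other document in the batch serves as a negative.
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    scores = scale * (q @ d.T)
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def matryoshka_mnr_loss(q, d, dims=(512, 256, 128, 64, 32)):
    # Sum the base loss over truncated prefixes so every prefix length
    # ends up being a usable embedding on its own.
    return sum(mnr_loss(q[:, :k], d[:, :k]) for k in dims)

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 512))
d = rng.standard_normal((8, 512))
print(matryoshka_mnr_loss(q, d) > 0)  # True
```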
---

## 💙 Why Choose SSE Retrieval MRL? 💙

✅ **Higher NDCG@10** than static-retrieval-mrl-en-v1 (0.5124 vs 0.5032)
✅ **Half the dimensions** (512D vs 1024D) → faster retrieval & less storage 💾
✅ **Matryoshka-ready** → smoothly truncate to 256D/128D/64D/32D as needed
✅ **MIT licensed** → free for commercial & personal use 🌼
✅ **CPU-friendly** → runs beautifully even on modest hardware 💻
✅ **Trained on diverse data** → understands everything from medical papers to trivia! 📚✨

---

## 🌸 Citation 💙

If our model brings joy to your project, please cite:

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    year = "2019",
    url = "https://arxiv.org/abs/1908.10084",
}

@misc{kusupati2022matryoshka,
    title={Matryoshka Representation Learning},
    author={Kusupati, Aditya and others},
    year={2022},
    eprint={2205.13147},
    archivePrefix={arXiv},
}
```

---

<div align="center">

💙 *Made with love for efficient, accurate, and accessible semantic search* 💙
✨ *May your embeddings always be meaningful and your retrieval always gentle* ✨

</div>

## Training Details